Title: Open Source for Cyber Defence/Progress

Release Date: 2015-02-04

Description: This page from GCHQ's internal GCWiki, last modified on 25 June 2012, enumerates open-source data sets that are available in various agency databases: see the Intercept article Western Spy Agencies Secretly Rely on Hackers for Intel and Expertise, 4 February 2015.


The maximum classification allowed on GCWiki is TOP SECRET STRAP1 COMINT. Click to report inappropriate content.

For GCWiki help contact: webteam [REDACTED] Support page

Open Source for Cyber Defence/Progress

From GCWiki

Many structured datasets are now available in the HAPPY TRIGGER database. Unstructured datasets are being worked on and will go to LOVELY HORSE. Other integration with TWO FACE and ZooL is in place, and more will come to XKEYSCORE.


• 1 Data currently gathered

• 2 Future ones to work on

o 2.1 Vulnerability Intelligence
o 2.2 Bulk Infrastructure Data
o 2.3 Miscellaneous

Data currently gathered

Data source Nature of the data OPP-LEG Status In HAPPY TRIGGER? In LOVELY HORSE? In ZooL? In TWO FACE? Update frequency
alexa.com Top domains list; has previously been used to find popular social networking sites in foreign countries to help with analyst investigations. Approved Automatic updates on daily basis
user-agents.org User agent strings, useful for finding spoofed or malicious entries Approved Manual update
www.nsrl.nist.gov Access to hashes of known COTS files Approved (for free scrape) Manual update every three months
list) Used to help map out IP ranges of networks being monitored. Approved (for free scrape) Manual update on best endeavours basis
ZeusTracker.abuse.ch Zeus specific malware tracking including IPs, binaries and domains to be used by the e-crime team. Approved Automatic updates on hourly basis
SpyEyeTracker.abuse.ch SpyEye specific malware tracking including IPs, binaries and domains to be used by the e-crime team. Approved Automatic updates on hourly basis
amada.abuse.ch Useful for declassifying information about known malicious IPs and domains. Approved Automatic updates on hourly basis
http://torstatus.blutmagie.de/ TOR consensus document, useful for identifying whether a target was using TOR and the status of the individual nodes. Approved Automatic updates on hourly basis
EmergingThreats.net Snort rules used for network monitoring purposes Approved (for free data) Manual updates on best endeavours basis
PremiumDrops.com Daily newly registered domains to alert analysts to suspicious domains worth investigating for malicious activity Approved Currently unavailable, need to find covert access method for paid content
verisign.com Monthly updates of newly registered domains to alert analysts to suspicious domains worth investigating for malicious activity Approved
MalwareDomainList.com General malware tracking resource Approved Currently 'one-off' sample
twitter.com Real-time alerting to new security issues reported by known security professionals, or planned activity by hacking groups e.g. Anonymous. For more information about the sources currently being brought into the building see source list on the LOVELY HORSE wiki Approved Prototype currently running. For more information see LOVELY HORSE
ContagioMiniDump.com Most recommended blog by CDO analysts. Highly regarded for malware analysis relevant to APT investigations. Can be useful to declassify information for reporting purposes Approved
metasploit.com Access to new zero-day exploits for the malware team to analyse Approved (for free data)
exploit-db.com Access to an archive of exploits and vulnerable software. Exploits from submittals and mailing lists collected into one database. Approved
ics.sans.edu (Internet Storm Center) Already used by GovCertUK on a daily basis for timely and relevant security news and incident reporting. Approved Currently updated on best endeavours basis
POSITIVE PONY IP address to company and sector mapping. See the POSITIVE PONY wiki page for more details. Approved Further approvals pending (dev) Currently a static data set
NETPLATE Multiple data types - details will be included on this page when releasable

(POSITIVE PONY screenshots.)

Future ones to work on



Available from


From the Passive Sigint system, or buy from RIRs (Regional Internet Registries)? Or can we find another way of getting all updates copied to us? What about NSA's FOXTRAIL? Or our own GeoFusion? And there's now REFRIED CHICKEN from |REDACTED| ("It's a database of passively intercepted domain WHOIS records, searchable by any word in the record. Since Feb 2011. There are legal and policy constraints which mean you cannot search domains, or terms within records, that may be sensitive on grounds of location or nationality without appropriate authorisation. If you would like an account please let me know. Access to the data relies on having a Global Surge Account.")




maybe an analytic run against the main DNS records to find the new domains, or is there a more definitive source? Companies like Cyveillance are able to obtain feeds of new domain registrations (for 'brand monitoring'), so I imagine we'd be able to get hold of something similar... |REDACTED|@gchq 09:51, 7 September 2011 (BST)


morning and

Filtering Volumetrics Comments


don't know
ask the

NSA's FOXTRAIL is in this space, and needs more checks to see whether it isn't suitable. And GeoFusion (poc: |REDACTED|).


NSA's FOXTRAIL is in this space, and needs more checks to see whether it isn't suitable
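
The analytic suggested above, comparing DNS records over time to find new domains, amounts to a set difference between two snapshots. A minimal sketch in Python, assuming each snapshot is simply an iterable of domain names; the function name and the sample domains are hypothetical and are not taken from any system named on this page:

```python
def newly_seen_domains(previous, current):
    """Return domains present in the current snapshot but not the previous one."""
    def normalise(name):
        # DNS names are case-insensitive; also drop any trailing root dot.
        return name.strip().lower().rstrip(".")

    seen_before = {normalise(d) for d in previous if d.strip()}
    seen_now = {normalise(d) for d in current if d.strip()}
    # Set difference: everything seen now that was not seen before.
    return sorted(seen_now - seen_before)


# Hypothetical snapshots for illustration only.
yesterday = ["example.com", "Example.org."]
today = ["example.com", "example.org", "new-example.net"]
print(newly_seen_domains(yesterday, today))  # ['new-example.net']
```

Normalising case and the trailing dot before comparing avoids reporting a domain as new merely because two feeds spell it differently. A feed of actual registrations would still be the more definitive source, as the comment above notes.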

Site Type of data Legal status
Pastebin An increasing number of tip-offs are coming from the Pastebin website, as this is where many hackers anonymously advertise and promote their exploits, by publishing stolen information. An automated, regular search (say, weekly) across Pastebin for certain keywords such as .gov.uk or GSI or HMG etc. would be very valuable to ensure that GovCertUK is always notified if any information that they need to be concerned about appears in open source. "30-11-2011 GovCertUK briefed about an attack on a UN server. This tip came from open source and specifically from Pastebin where the stolen emails and passwords had been posted online." NOT APPROVED: The nature of this site means that it would be very difficult to demonstrate the proportionality of scraping the whole site to identify the small proportion of information that would be of value to CDO and therefore approval cannot be given for scraping of the site.
OVAL List for NDR to feed into HIDDEN SPOTLIGHT vulnerability database APPROVED
Afraid.org |REDACTED|: This lists domains which are publicly available for anyone to add a sub-domain to. CDO analysts have suggested that this should be another resource they check alongside whois and robtex when investigating a domain.
Joe Stewart's blog for Dell SecureWorks |REDACTED|: this regularly includes SNORT rules and other information that can be signatured. APPROVED
scadasec mailing list |REDACTED| request APPROVED

Vulnerability Intelligence

Knowledge required Available from

twitter traffic for vulnerabilities use twitter API in standard way

certain blogs and CERT web sites for vulnerabilities

certain CERT IRC chatrooms for


certain CERT email lists for


Commits to open source code
repositories and security patch

Emerging Threats 'Open'

direct web scrape (if allowed).
MHS OSINT pages have

direct IRC access (if allowed)
direct reception
GitHub etc.

Scraped via SHORTFALL



by twitter

hourly? malware-w


names of known

Volumetrics Comments

very small (MB) Current work is BIRD SEED. JTRIG's BIRDSTRIKE provides the scraping already, but only for handfuls of IDs, and doesn't repeat. The tweets require data mining. Experiment run by CDT for NDR using Cyber Cloud, and has OPP-LEG approval already.

hourly? by list of specific sites/pages

small (GB)

TR-CISA have previously run several contracts looking at this problem, with view to delivery to CNE. Final wrap up work is scheduled to automate the derivation of SEM rules (see TR-FSP) from open source information such that machines matching those rules (vulnerabilities) can be found in passive. Wanted by NDR (ref MARBLE POLLS) and GovCERT. See Open source vulnerability sources.




by list of specific IRCs

by list of specific mailing lists

by specific code projects,



small (GB)

NB: Assume will include some encrypted IRCs. Wanted by GovCERT. Maybe a MARBLE POLLS source.

NB: Assume will include some encrypted email (including PGP). Wanted by GovCERT. Maybe a MARBLE POLLS source.
Requested by NDR |REDACTED|.

Daily? By updated Snort rules ??? Approval granted from OP-LEG to scrape info.

Bulk Infrastructure Data

Knowledge required



known good lists

known ORB servers

Available from



Filtering Volumetrics


eg, SpamHaus block lists, DNS block lists (dnsbl.abuse.ch), DNS blackholing lists (malwaredomainlist.com), Drive-by downloads (blade-defender.org) etc.


eg, Clean MX (support.clean-mx.de), and perhaps Google's Safe Browsing API could be used (see blog entry)
several times a


small (GB) SpamHaus import is already an exploit-level service from ITServices. TR-CISA have just completed an initial study of open sources of this sort of information, with an initial delivery of sample data to CDO. Longer term, we can set up an automated service to fetch this regularly from the Internet, although initially we will use JTRIG infrastructure. Some directly requested by CDO via |REDACTED|.

small (GB) Directly requested by CDO via |REDACTED|

from sources eg, GhostNet



idea from CDO

Miscellaneous

Knowledge required
UK address to protect

USER_AGENT strings, sources, and expected frequency

Malware development and hacking techniques being
discussed in forums


Categories: Cyber Defence | Open Source Information

Available from

need to find out how we get them at the



requires covert monitoring of forums weekly?

Filtering Volumetrics


small (GB)

|REDACTED| apparently got complete list of .gov.uk domains via JANET in June 2011. |REDACTED| trawled KED (and therefore probably Akamai whois data) to find some List X network info.

small (GB)

see User Agent prototype by |REDACTED|. Of wider interest.

CKX currently working with E-crime to identify and evaluate forums of potential interest. This project may extend to active monitoring of and reporting on discussions in selected forums. CKX Ops Manager is |REDACTED|.


POC: [REDACTED]





• This page was last modified on 25 June 2012, at 09:42.

• This page has been accessed 640 times.

• All material is UK Crown Copyright © 2008 or is held under licence from third parties. This information is exempt under the Freedom of Information Act 2000 (FOIA) and may be exempt under other UK information legislation. Refer any FOIA queries to GCHQ on 01242 221491 x30306 or infoleg@gchq.gsi.gov.uk
