Commit graph

165 commits

Author SHA1 Message Date
Mokaddem
14e9850dd6 Added new module for Duplicate paste. Seems to work but has a small bug (re-checks the same paste twice) 2016-07-15 16:58:48 +02:00
Mokaddem
2383db022f Added default configuration 2016-07-15 09:10:44 +02:00
Mokaddem
9a9e07f600 Added default configuration 2016-07-15 09:08:38 +02:00
Mokaddem
0332f23579 Added SimHash library 2016-07-15 08:56:16 +02:00
Mokaddem
fba14bfb4b In index: Added number of processed pastes chart 2016-07-14 11:16:07 +02:00
Mokaddem
ab61e32399 Commented out get_language because it adds too much overhead 2016-07-14 11:15:15 +02:00
Mokaddem
594d2def35 In index: Added number of processed pastes chart 2016-07-13 15:57:33 +02:00
Mokaddem
56b6659d8b Commented out get_language because it adds too much overhead 2016-07-13 08:59:48 +02:00
Mokaddem
c51bdec8aa Merge branch 'mokaddem-testing' 2016-07-12 11:53:24 +02:00
Mokaddem
8a1247cf5d modified variable name str 2016-07-12 11:52:19 +02:00
Mokaddem
7e5ce0f17f Optimized create_plot and removed test comments 2016-07-12 11:47:51 +02:00
Mokaddem
465244e1ce Added dynamic table sorting in search page. (Still need to add dependencies) 2016-07-07 16:38:00 +02:00
Mokaddem
7ff9b9a583 Added DomainTrending, seems to be working.
Started search features with related HTML pages, not finished yet.
2016-07-05 16:53:03 +02:00
Mokaddem
5a9eca9291 Added few comments 2016-07-04 09:18:23 +02:00
Mokaddem
4b3101b7b6 Added template tld. Modified URL using Faup and refactored WebStats. 2016-07-01 16:59:08 +02:00
Mokaddem
beeeb76de9 Added new modules and started WebTrending web interface 2016-06-30 14:38:28 +02:00
Mokaddem
3dc356dc5e Getting Started: Initial configuration working on the laptop 2016-06-30 14:36:47 +02:00
MaximeStor
ab66cd255a Improve SourceCode, keywords and add description in /doc 2016-03-12 12:30:38 +01:00
MaximeStor
701d771aa5 Add first version of Source Code 2016-03-12 11:21:29 +01:00
Raphaël Vinot
be86737ca7 Because 0MQ fails. 2016-03-11 16:16:53 +01:00
Raphaël Vinot
f6e4ea2270 Fix logging, fix URL regex 2016-02-11 12:19:03 +01:00
Raphaël Vinot
d160e4a2c8 Add Credential in the scripts to launch 2016-02-10 17:33:16 +01:00
Raphaël Vinot
90e1b25426 Split filepath and count in credential module 2016-02-10 17:31:52 +01:00
Raphaël Vinot
4895ee9fa2 Add new category (Credential) 2016-02-10 16:39:56 +01:00
Raphaël Vinot
1da8675750 Refactoring on Credential, Phone and Release 2016-02-10 16:39:06 +01:00
c68136b04b Merge branch 'pgp' of https://github.com/Rafiot/AIL-framework
Conflicts:
	bin/packages/modules.cfg
2016-02-08 10:13:44 +01:00
MaximeStor
b7d2b64a86 Merge branch 'master' of https://github.com/CIRCL/AIL-framework into module 2016-02-06 11:28:48 +01:00
192ee7f0ec Merge pull request #49 from Alainfou/master
Phone module added (regex needs optimization)
2016-02-06 11:20:46 +01:00
Alain
ea52fd1068 Phone regex updated
Might still need to be fixed / optimized, in case of maths or random numbers starting with a 0. Do not capture dates, hours, coordinates anymore. Captured formats are: e.g. +331234567890 ; 09 12 34 56 78 ; +4177/123.45.69 ; +352(0)6-23-23-23...
2016-02-05 20:58:02 +01:00
Alain
fabbfd8ae9 Update module.cfg (adding Keys and Phone section) 2016-02-05 14:00:41 -05:00
Alain
43b3556588 Starting Phone number recognition 2016-02-05 13:58:21 -05:00
MaximeStor
07513a5b37 Add modules Credential and Release 2016-02-05 16:15:09 +01:00
Raphaël Vinot
9171d5b118 Add module to find PGP encrypted blobs 2016-02-05 16:03:37 +01:00
Raphaël Vinot
aef8ab0411 Listen locally for 0MQ 2016-02-04 15:32:50 +01:00
Raphaël Vinot
5ca13c42eb Launch redis and leveldb from local directory 2016-02-04 15:24:39 +01:00
Raphaël Vinot
12aca6b760 Add script to import from local directory, use local python from env 2016-02-04 15:22:51 +01:00
Raphaël Vinot
315cb48117 Add template file for writing a new module 2016-02-03 10:33:42 +01:00
Raphaël Vinot
0d6adc2063 Add initial Travis file 2016-01-19 11:43:34 +01:00
cdd0725e88 -v option added to list the path 2015-12-22 21:37:05 +00:00
e3971ac93a Onion fetching loop deactivated by default 2014-12-22 16:06:38 +00:00
Raphaël Vinot
08ceefc375 Re-add config option 2014-12-22 16:50:25 +01:00
Raphaël Vinot
50369c6706 Revert changes on the config file due to merging mess-up 2014-12-22 16:29:05 +01:00
Raphaël Vinot
f717f9fe89 Merge branch 'master' of github.com:CIRCL/AIL-framework 2014-12-22 15:32:48 +01:00
Raphaël Vinot
9ee61db2cf Add hotfixes 2014-12-22 15:27:02 +01:00
Raphaël Vinot
8803c8447a Publish the fetched onions on a ZMQ feed. 2014-09-30 16:55:16 +02:00
25757b0fff A simple feeder script feeding data from pystemon to AIL.
The configuration matches the default Redis parameters used
in the pystemon configuration.

https://github.com/cvandeplas/pystemon/blob/master/pystemon.yaml#L16
2014-09-19 14:03:05 +02:00
Raphaël Vinot
65b9a01644 Add config file for DomainClassifier, proper reporting 2014-09-17 17:22:56 +02:00
27b134ec03 Add proper publisher for classified domains/hostnames 2014-09-10 09:27:47 +02:00
Raphaël Vinot
f017680365 fix onions, cc and domain classifier modules 2014-09-08 16:51:43 +02:00
de6e21d5a7 DomainClassifier sample configuration added 2014-09-08 16:44:05 +02:00
246621f663 First version of the DomainClassifier 2014-09-08 16:43:21 +02:00
1397db9691 Global queue for DomainClassifier 2014-09-08 11:07:45 +02:00
Raphaël Vinot
e983c839ad Categ now listen to the Global queue 2014-09-05 17:05:45 +02:00
Raphaël Vinot
46f27ada4e More cleanup 2014-09-05 10:42:01 +02:00
Raphaël Vinot
fca00beed9 Add Domain Classifier module.
Cleanup in the config files.
2014-09-05 10:41:00 +02:00
Raphaël Vinot
b7c9e489c9 Fix the exceptions 2014-09-04 11:46:07 +02:00
Raphaël Vinot
9e8611a42d stop killing the disk when creating the word curve 2014-09-02 18:20:28 +02:00
Raphaël Vinot
7542eaf739 Update starting script. 2014-09-02 15:21:36 +02:00
Raphaël Vinot
0c6b09f379 Fix the onion module, log the valid onions. 2014-09-01 16:18:06 +02:00
Raphaël Vinot
f4b89669fc The onion module now fetches the URLs it finds. 2014-08-31 22:42:12 +02:00
Raphaël Vinot
abfe13436b Big refactoring, make the queues more flexible 2014-08-29 19:37:56 +02:00
Raphaël Vinot
623e876f3b Cleanup.
* Remove useless subscriber
* Fix typo in the config file
* Update Helper accordingly
2014-08-26 17:36:57 +02:00
3b499a2ec8 ZMQ Publisher removed
ZMQ Publisher removed to allow concurrent use of the scripts.
In the short term, we will replace all publishing parts within AIL
with Redis pub-sub to avoid the ZMQ limitation.
2014-08-26 14:38:49 +02:00
f070ac2005 cymruwhois uses dotted decimal format 2014-08-25 10:05:36 +02:00
Raphaël Vinot
3886d1b834 Small fixes to make the refactoring production ready
* the port for the logging is 6380
* use os.environ properly
* fix typos
2014-08-22 17:35:40 +02:00
Raphaël Vinot
78125db4ea Use env variables everywhere 2014-08-22 14:52:02 +02:00
Raphaël Vinot
277d138a5d cleanup, add FIXME 2014-08-21 14:39:17 +02:00
Raphaël Vinot
63b29176c1 move Redis_Data_Merging to Paste 2014-08-21 12:22:07 +02:00
Raphaël Vinot
50cfac857e Update config
Make all paths in the config file relative to the home directory.
2014-08-20 16:00:56 +02:00
Raphaël Vinot
a68f5b6a0e fix subscriber names, update default config 2014-08-20 15:54:21 +02:00
Raphaël Vinot
2485ba5df2 Merge remote-tracking branch 'origin/master' into testing
Conflicts:
	bin/ZMQ_Sub_Urls.py
2014-08-20 15:24:10 +02:00
Raphaël Vinot
99c8cc7941 completely remove ZMQ_PubSub.py 2014-08-20 15:14:57 +02:00
1d64dc44c8 MIME type guessing - removed one duplicate call to libmagic 2014-08-20 10:22:33 +02:00
Raphaël Vinot
8d9ffbaa53 Do not create a ZMQ sub if it is not required. 2014-08-19 19:53:33 +02:00
Raphaël Vinot
45b0bf3983 Improve the cleanup. Still some to do. 2014-08-19 19:07:07 +02:00
Raphaël Vinot
f1753d67c6 Cleanup the queues. 2014-08-19 16:05:37 +02:00
e8fcea6cd6 Remove undeclared variable 2014-08-18 16:17:36 +02:00
7d8ee102a3 Assignment before use (if Enumerate fails) 2014-08-18 15:58:06 +02:00
4304c6858e Configuration path fixed 2014-08-18 09:02:08 +02:00
Raphaël Vinot
078c8ea836 Big cleanup, pep8 2014-08-14 18:07:18 +02:00
Jules
ab6765315e Merge pull request #13 from adulau/master
Log where URLs are hosted - cc_critical option added
2014-08-14 14:28:01 +02:00
762def3a23 Log where URLs are hosted - cc_critical option added
It logs where the hostname of the URL is hosted (ASN and geographic location).
A simple option cc_critical added to set the country code to log as critical.
2014-08-14 14:22:11 +02:00
Raphaël Vinot
4a1f300a1a Cleanup (remove unused imports, more pep8 compatible) 2014-08-14 14:11:07 +02:00
Starow
04a8f1bdf2 maxi cleanup old code :'( 2014-08-14 11:48:46 +02:00
Starow
29b24b6466 printing set of domain for debugging 2014-08-13 16:35:27 +02:00
Raphaël Vinot
ece3bc173e Cleanup of main Paste module 2014-08-13 11:56:22 +02:00
Raphaël Vinot
5b17d416c8 remove script installed by pubsublogger 2014-08-13 11:55:59 +02:00
Raphaël Vinot
935e51c961 Remove 3rd party code (pubsublogger), add it in the deps. 2014-08-13 10:19:43 +02:00
Starow
37033ca3a6 Minor log modifications 2014-08-13 10:08:44 +02:00
Starow
6aa4d7cb7d Harmonising log messages + Changing some dygraph options 2014-08-12 15:42:16 +02:00
0b4a80b7ea -s option added to find similar documents
By default, the index is not storing the vector of the document (Whoosh
document schema). It won't work if you don't change the schema of the
index for the content. It depends on your storage strategy.
2014-08-12 13:42:26 +02:00
fd6e1a8436 -f option added: dump full document for each match 2014-08-12 13:26:56 +02:00
0a6664ffba Indexer: Some index statistics added
usage: indexer_lookup.py [-h] [-q Q] [-n] [-t] [-l]

Fulltext search for AIL

optional arguments:
  -h, --help  show this help message and exit
  -q Q        query to lookup (one or more)
  -n          return number of indexed documents
  -t          dump top 500 terms
  -l          dump all terms encountered in indexed documents
2014-08-11 15:07:12 +02:00
f65a94d47b -l added -> dumping all terms indexed 2014-08-11 14:56:15 +02:00
f3d1ca052e Return the number of indexed documents 2014-08-11 14:50:35 +02:00
611d2a466f Configuration that should not be there... 2014-08-11 14:24:27 +02:00
2b8f2689bf Indexer queue and script added to "BBS-like" LAUNCH script 2014-08-11 14:06:52 +02:00
9657c6bf80 Merge branch 'master' of https://github.com/CIRCL/AIL-framework 2014-08-11 13:46:37 +02:00
b1053af3cd Indexer module: script to query the index
Test script to query the index generated from the Indexer module.

python indexer_lookup.py -q Visa -q Mastercard
2014-08-11 12:03:27 +02:00
Starow
079db6f80c Hardcoded paths from ZMQ_Curve are now correctly referenced in config.cfg.sample, fix #6 2014-08-11 11:33:18 +02:00