diff --git a/HOWTO.md b/HOWTO.md
new file mode 100644
index 00000000..1229c6b2
--- /dev/null
+++ b/HOWTO.md
@@ -0,0 +1,61 @@
+How to
+======
+
+How to feed the AIL framework
+-----------------------------
+
+For the moment, there are three different ways to feed AIL with data:
+
+1. Be a collaborator of CIRCL and ask to access our feed. It will be sent to the static IP your are using for AIL.
+
+2. You can setup [pystemon](https://github.com/CIRCL/pystemon) and use the custom feeder provided by AIL (see below).
+
+3. You can feed your own data using the [./bin/import_dir.py](./bin/import_dir.py) script.
+
+###Feeding AIL with pystemon
+AIL is an analysis tool, not a collector!
+However, if you want to collect some pastes and feed them to AIL, the procedure is described below.
+
+Nevertheless, moderate your queries!
+
+Feed data to AIL:
+
+1. Clone the [pystemon's git repository](https://github.com/CIRCL/pystemon)
+
+2. Install its python dependencies inside your virtual environment
+
+3. Launch pystemon ``` ./pystemon ```
+
+4. Edit your configuration file ```bin/packages/config.cfg``` and modify the pystemonpath path accordingly
+
+5. Launch pystemon-feeder ``` ./pystemon-feeder.py ```
+
+
+How to create a new module
+--------------------------
+
+If you want to add a new processing or analysis module in AIL, follow these simple steps:
+
+1. Add your module name in [./bin/packages/modules.cfg](./bin/packages/modules.cfg) and subscribe to at least one module at minimum (Usually, Redis_Global).
+
+2. Use [./bin/template.py](./bin/template.py) as a sample module and create a new file in bin/ with the module name used in the modules.cfg configuration.
+
+
+How to create a new webpage
+---------------------------
+
+If you want to add a new webpage for a module in AIL, follow these simple steps:
+
+1. Launch [./var/www/create_new_web_module.py](./var/www/create_new_web_module.py) and enter the name to use for your webpage (Usually, your python module).
+
+2. A template and flask skeleton has been created for your new webpage in [./var/www/modules/](./var/www/modules/)
+
+3. Edit the created html files under the template folder as well as the Flask_* python script.
+
+How to contribute a module
+--------------------------
+
+Feel free to fork the code, play with it, make some patches or add additional analysis modules.
+
+To contribute your module, feel free to pull your contribution.
+
diff --git a/OVERVIEW.md b/OVERVIEW.md
new file mode 100644
index 00000000..d852b658
--- /dev/null
+++ b/OVERVIEW.md
@@ -0,0 +1,15 @@
+Overview
+========
+
+Redis and LevelDB overview
+--------------------------
+
+* Redis on TCP port 6379 - DB 0 - Cache hostname/dns
+* DB 1 - Paste meta-data
+* Redis on TCP port 6380 - Redis Log only
+* Redis on TCP port 6381 - DB 0 - PubSub + Queue and Paste content LRU cache
+ DB 1 - __Mixer__ Cache
+* LevelDB on TCP port 6382 - DB 1-4 - Curve, Trending, Terms and Sentiments
+* LevelDB on TCP port - DB 0 - Lines duplicate
+ DB 1 - Hashs
+
diff --git a/README.md b/README.md
index 1e4c7b6d..828f2047 100644
--- a/README.md
+++ b/README.md
@@ -45,16 +45,19 @@ Features
 
 * Modular architecture to handle streams of unstructured or structured information
 * Default support for external ZMQ feeds, such as provided by CIRCL or other providers
+* Multiple feed support
 * Each module can process and reprocess the information already processed by AIL
 * Detecting and extracting URLs including their geographical location (e.g. IP address location)
-* Extracting and validating potential leak of credit cards numbers
+* Extracting and validating potential leak of credit cards numbers, credentials, ...
 * Extracting and validating email addresses leaked including DNS MX validation
 * Module for extracting Tor .onion addresses (to be further processed for analysis)
+* Keep tracks of duplicates
 * Extracting and validating potential hostnames (e.g. to feed Passive DNS systems)
 * A full-text indexer module to index unstructured information
-* Modules and web statistics
+* Statistics on modules and web
+* Realtime modules manager in terminal
 * Global sentiment analysis for each providers based on nltk vader module
-* Terms tracking and occurrence
+* Terms, Set of terms and Regex tracking and occurrence
 * Many more modules for extracting phone numbers, credentials and others
 
 Installation
@@ -101,69 +104,9 @@ Eventually you can browse the status of the AIL framework website at the followi
 
 ``http://localhost:7000/``
 
-How to
-======
 
-How to feed the AIL framework
------------------------------
-
-For the moment, there are two different ways to feed AIL with data:
-
-1. Be a collaborator of CIRCL and ask to access our feed. It will be sent to the static IP your are using for AIL.
-
-2. You can setup [pystemon](https://github.com/CIRCL/pystemon) and use the custom feeder provided by AIL (see below).
-
-###Feeding AIL with pystemon
-AIL is an analysis tool, not a collector!
-However, if you want to collect some pastes and feed them to AIL, the procedure is described below.
-
-Nevertheless, moderate your queries!
-
-Here are the steps to setup pystemon and feed data to AIL:
-
-1. Clone the [pystemon's git repository](https://github.com/CIRCL/pystemon)
-
-2. Install its python dependencies inside your virtual environment
-
-3. Launch pystemon ``` ./pystemon ```
-
-4. Edit your configuration file ```bin/packages/config.cfg``` and modify the pystemonpath path accordingly
-
-5. Launch pystemon-feeder ``` ./pystemon-feeder.py ```
-
-
-How to create a new module
---------------------------
-
-If you want to add a new processing or analysis module in AIL, follow these simple steps:
-
-1. Add your module name in [./bin/packages/modules.cfg](./bin/packages/modules.cfg) and subscribe to the Redis_Global at minimum.
-
-2. Use [./bin/template.py](./bin/template.py) as a sample module and create a new file in bin/ with the module name used in the modules.cfg configuration.
-
-How to contribute a module
---------------------------
-
-Feel free to fork the code, play with it, make some patches or add additional analysis modules.
-
-To contribute your module, feel free to pull your contribution.
-
-Overview and License
-====================
-
-
-Redis and LevelDB overview
---------------------------
-
-* Redis on TCP port 6379 - DB 1 - Paste meta-data
-* DB 0 - Cache hostname/dns
-* Redis on TCP port 6380 - Redis Pub-Sub only
-* Redis on TCP port 6381 - DB 0 - Queue and Paste content LRU cache
-* Redis on TCP port 6382 - DB 1-4 - Trending, terms and sentiments
-* LevelDB on TCP port - Lines duplicate
-
-LICENSE
--------
+License
+=======
 
 ```
 Copyright (C) 2014 Jules Debra