| Author | Commit | Message | Date |
|---|---|---|---|
| David Cruciani | f5ac98bcf0 | add: [crawler] i2p splash crawler | 2022-06-30 17:33:00 +02:00 |
| Terrtia | b37d05842b | fix: [crawler] crawler test: remove print | 2021-03-30 09:19:13 +02:00 |
| Terrtia | c0be210d2c | chg: [crawler] add test + relaunch crawlers + major fixs | 2021-03-29 20:27:20 +02:00 |
| Terrtia | 8754350d39 | fix: [crawler] user agent + splash restart | 2021-03-26 11:30:06 +01:00 |
| Terrtia | 11d537e2eb | chg: [screenshot + har directory] add option to change screenshots directory | 2021-01-08 17:37:18 +01:00 |
| Terrtia | 2c30f1edf9 | fix: [crawler] fix ResponseNeverReceived retry time | 2020-09-14 17:13:12 +02:00 |
| Terrtia | abfad61581 | fix: [crawler] fix ResponseNeverReceived hanlder, check if splash restarted | 2020-09-14 17:03:36 +02:00 |
| Terrtia | d20ae35548 | fix: [crawler] option to disable screenshots and har | 2020-06-04 16:05:32 +02:00 |
| | a997e546cb | chg: [tor crawler] nyt added | 2020-05-26 14:43:40 +02:00 |
| | 6b7514f6f0 | chg: [black-list onion] keybase added | 2020-05-23 21:06:41 +02:00 |
| Terrtia | 5ad529bc7f | fix: [crawler] typo | 2020-04-06 10:57:51 +02:00 |
| Terrtia | 672bb02bbf | fix: [Crawler splash ResponseNeverReceived] add retry | 2020-04-06 10:52:44 +02:00 |
| Terrtia | 179fba2ecc | fix: [crawler] error catcher | 2020-04-01 14:58:27 +02:00 |
| Terrtia | f97698ad44 | Merge branch 'master' into crawler_v2 | 2020-04-01 09:59:33 +02:00 |
| Terrtia | 72f1f15659 | chg: [crawler] edit cookie and cookiejar + add cookie to cookiejar + fix screenshot duplicate | 2020-04-01 09:58:47 +02:00 |
| Terrtia | 5f289f04f3 | chg: [Crawler core + UI] crawler lua: handle retry + fix cookie loader and selector | 2020-03-30 18:43:50 +02:00 |
| Terrtia | e3e543ba8b | chg: [Crawler] default docker memory usage | 2020-03-30 09:42:14 +02:00 |
| Terrtia | 169c4a8ec7 | chg: [cookiejar UI] add cookiejar + show all | 2020-03-27 17:06:26 +01:00 |
| Terrtia | d87ecff4a0 | chg: [crawler - cookies] add/show/select cookies | 2020-03-24 17:15:43 +01:00 |
| Terrtia | 1c45571042 | chg: [crawler] add cookies list by user/global, save cookies from file + dict(name, value), TODO: API + handle errors | 2020-03-23 18:00:09 +01:00 |
| Terrtia | db634e8866 | fix: [crawler] cleanup | 2020-03-20 16:20:01 +01:00 |
| Terrtia | 6cfd3fe36d | chg: [crawler] bypass login: use cookie provided by user and accept cookie from server + refractor | 2020-03-20 16:15:25 +01:00 |
| Terrtia | 42ea678b7a | chg: [Splash Crawler] use cookies to bypass login | 2020-03-09 17:02:18 +01:00 |
| Terrtia | 354a4fef7d | fix: [Crawler] typo | 2019-12-19 16:58:36 +01:00 |
| Terrtia | 218f1af241 | fix: [Crawler] fix screenshot-domain typo | 2019-12-19 08:54:32 +01:00 |
| Terrtia | 3d81f30043 | fix: [Crawler] fix screenshot-domain typo | 2019-12-19 08:53:55 +01:00 |
| Terrtia | 363801fff7 | fix: [Crawler] fix screenshot-domain map | 2019-12-19 08:52:02 +01:00 |
| Terrtia | 056bad7a49 | chg: [screenshot correlation + v2.6] add screenshot-domain correlation + v2.6 update | 2019-12-17 15:13:36 +01:00 |
| Terrtia | bb03ef532b | chg: [Correlation UI] add correlation blueprint + UI graph correlation | 2019-11-14 17:05:58 +01:00 |
| Terrtia | c8d5ce9a28 | chg: [core] mv bin/packages/config.cfg configs/core.cfg + use ConfigLoader | 2019-11-05 15:18:03 +01:00 |
| Terrtia | 0389b9c23b | chg: [crawler] manual/auto crawler: always save screenshots | 2019-05-13 14:24:16 +02:00 |
| Terrtia | 254441f193 | chg: [crawler] manual/auto crawler: always save screenshots | 2019-05-13 13:56:43 +02:00 |
| Terrtia | e6dca7f8bf | chg: [update v1.5] add background update: screenshots_crawled | 2019-04-24 16:19:35 +02:00 |
| Terrtia | 9868833c77 | chg: [crawled screenshot] use sha256 as filepath | 2019-04-24 14:09:04 +02:00 |
| Terrtia | 59664efe45 | Merge branch 'master' into advanced_crawler | 2019-03-26 16:03:42 +01:00 |
| Terrtia | f64c385343 | chg: [Crawler] handle port: crawling + history | 2019-03-22 16:48:07 +01:00 |
| Terrtia | 7b32d7f34e | chg: [Crawler] major refractor | 2019-02-25 16:38:50 +01:00 |
| Terrtia | 60f7645ac1 | chg: [Crawler] refractor | 2019-02-22 17:00:24 +01:00 |
| Terrtia | e5dca268a8 | chg: [Crawler] refractor | 2019-02-21 09:54:43 +01:00 |
| Terrtia | b87707e8bc | fix: [Crawler] typo | 2019-02-12 15:54:42 +01:00 |
| Terrtia | 37276e52a3 | fix: [Crawler] typo | 2019-02-12 15:53:40 +01:00 |
| Terrtia | 7cb03fc769 | fix: [Crawler] typo | 2019-02-12 15:51:19 +01:00 |
| Terrtia | 7a4989ce10 | fix: [Global Crawler] max filename size | 2019-02-12 15:45:58 +01:00 |
| Terrtia | 516238025f | chg: [Crawler] add bootsrap4 src + refractor crawler | 2019-02-05 17:16:44 +01:00 |
| Thirion Aurélien | 44c513dcbb | chg: [Crawler] add onion to blacklist | 2019-01-31 16:56:45 +01:00 |
| Terrtia | 92d192238b | fix: [Crawler] change max page crawled | 2019-01-29 17:04:45 +01:00 |
| Terrtia | c1b34bd99c | fix: [Crawler] limit max crawled pages | 2019-01-29 15:38:00 +01:00 |
| Thirion Aurélien | f4ba21e492 | blacklist onion debian manpages | 2019-01-14 11:08:53 +01:00 |
| Terrtia | b3b75ccbea | fix: [Crawler] Restart Splash on failure, limit unbound in memory cache (maxrss) | 2019-01-04 15:51:08 +01:00 |
| Terrtia | 4d04333f54 | fix: [Splash server] add debug output | 2018-12-19 09:30:24 +01:00 |