- Download the Wikipedia dump file enwiki-*.xml.bz2 (https://dumps.wikimedia.org/enwiki/)
- Convert the downloaded, compressed XML to JSON by running the xmlparse.py script
- The resulting JSON can then be used as input for any of the scripts in the spark directory
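
The conversion step above can be pictured with the following minimal sketch. It is not the contents of xmlparse.py, whose implementation is not shown here. It assumes the dump follows the usual MediaWiki export layout (`<page>` elements containing a `<title>` and a revision `<text>`), and the function names `local_name`, `iter_pages` and `convert`, as well as the output fields `title` and `text`, are hypothetical choices made for illustration.

```python
import bz2
import json
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that namespaced XML puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield one dict per <page> element, parsing the stream page by page."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        # Hypothetical output schema: only title and text are kept.
        record = {"title": None, "text": None}
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                record["title"] = child.text
            elif name == "text":
                record["text"] = child.text or ""
        yield record
        elem.clear()  # drop the children of the page we just processed


def convert(dump_path, json_path):
    """Write one JSON object per line and return the number of pages written."""
    count = 0
    with bz2.open(dump_path, "rb") as src, \
            open(json_path, "w", encoding="utf-8") as dst:
        for record in iter_pages(src):
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Calling `convert` with the path of the downloaded dump and an output path of your choice would produce a file with one JSON object per line. That layout is chosen here because Spark's JSON reader expects line-delimited JSON by default; whether the scripts in the spark directory expect exactly this schema is an assumption.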
pmfischer/BA_Project