Where do I get the dumps?

English-language Wikipedia

- Dumps from any Wikimedia Foundation project: dumps.wikimedia.org
- English Wikipedia dumps in SQL and XML: dumps.wikimedia.org/enwiki/
- Download the data dump using a BitTorrent client (torrenting has many benefits and reduces server load, saving bandwidth costs).
- pages-articles-multistream.xml.bz2 – Current revisions only, no talk or user pages; this is probably what you want, and is over 19 GB compressed (expands to over 86 GB when decompressed).
- pages-meta-current.xml.bz2 – Current revisions only, all pages (including talk).
- all-titles-in-ns0.gz – Article titles only (with redirects).
- SQL files for the pages and links are also available.
- All revisions, all pages: these files expand to multiple terabytes of text. Please only download these if you know you can cope with this quantity of data. Go to Latest Dumps and look out for all the files that have 'pages-meta-history' in their name.
- To download a subset of the database in XML format, such as a specific category or a list of articles, see Special:Export, usage of which is described at Help:Export (a short programmatic sketch follows at the end of this section).

GET THE MULTISTREAM VERSION! (and the corresponding index file, pages-articles-multistream-index.txt.bz2). pages-articles.xml.bz2 and pages-articles-multistream.xml.bz2 both contain the same XML contents, so if you unpack either, you get the same data. But with multistream, it is possible to get an article from the archive without unpacking the whole thing.
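For illustration, here is a minimal Python sketch of that random access, assuming the usual index layout (one 'offset:page_id:title' line per page, where offset is the byte position of the bz2 stream containing that page); the local file names and the example title are placeholders, and this is a starting point rather than a supported tool.

```python
import bz2

# Placeholder local file names -- adjust to the dump you actually downloaded.
MULTISTREAM = "enwiki-latest-pages-articles-multistream.xml.bz2"
INDEX = "enwiki-latest-pages-articles-multistream-index.txt.bz2"


def find_stream(title):
    """Scan the index for `title` and return (stream_start, stream_end).

    Each index line is 'offset:page_id:title'; pages sharing an offset live
    in the same bz2 stream, so the end of our stream is the next distinct
    offset (or None if the page sits in the final stream).
    """
    start = None
    with bz2.open(INDEX, mode="rt", encoding="utf-8") as idx:
        for line in idx:
            offset_str, _page_id, page_title = line.rstrip("\n").split(":", 2)
            offset = int(offset_str)
            if start is None:
                if page_title == title:
                    start = offset
            elif offset > start:
                return start, offset
    if start is None:
        raise KeyError(f"{title!r} not found in index")
    return start, None


def read_stream(start, end):
    """Decompress only the bz2 stream at byte range [start, end) of the dump."""
    with open(MULTISTREAM, "rb") as dump:
        dump.seek(start)
        data = dump.read(end - start) if end is not None else dump.read()
    return bz2.decompress(data).decode("utf-8")


if __name__ == "__main__":
    start, end = find_stream("Anarchism")
    fragment = read_stream(start, end)          # a block of <page> elements, no XML root
    wanted = "<title>Anarchism</title>"
    for page in fragment.split("<page>")[1:]:   # crude split; a real XML parser is safer
        if wanted in page:
            print(page[:500])                   # show the start of the matching page
            break
```

Only the index (a few hundred MB) and one small bz2 block are decompressed, which is the whole point of the multistream layout.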
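For the Special:Export route mentioned in the list above, a single page's current revision can be fetched as export XML with a plain HTTP GET of Special:Export/<title>. The following is a minimal sketch; the page title and User-Agent string are placeholders, and Help:Export documents the full set of options (lists of titles, categories) not shown here.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def export_page_xml(title, lang="en"):
    """Fetch one page's current revision as MediaWiki export XML via Special:Export."""
    url = f"https://{lang}.wikipedia.org/wiki/Special:Export/{quote(title)}"
    # Wikimedia asks clients to send an identifying User-Agent (placeholder here).
    req = Request(url, headers={"User-Agent": "export-sketch/0.1 (you@example.org)"})
    with urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    xml = export_page_xml("Ship of Theseus")   # any article title works here
    print(xml[:300])
```

For a specific category or a longer list of articles, use the interactive Special:Export form as described at Help:Export rather than scripting one request per page.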