
This simple program displays the text-only compressed Wikipedia dumps, currently available at http://download.wikimedia.org/backup-index.html and generally named something like pages-articles.xml.bz2.
It's fairly usable now, although many rendering issues remain.
Features include a Qt viewer with basic text markup, link following, the ability to read directly from the .bz2 compressed file (although an index creation step is needed on first run), a tab-like list of articles that load in the background by default, a simple but useful keyword search, very light source code, and optional LaTeX rendering.
The code requires PyQt4.
Older versions have been tested on Fedora Core 4 and Kubuntu with PyQt4.1 (Python 2.4, Qt 4.2), and on Ubuntu Gutsy.
See the included README.
Note that the development tree is now hosted on launchpad. See https://launchpad.net/wikipediadumpreader/
Any comment is welcome.
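
The description mentions reading directly from the .bz2 file after a one-time index creation step. The program's real index format is not documented on this page, so the following is only a minimal sketch of the general idea, written for current Python 3 rather than the Python 2 the program targets. The function names `build_title_index` and `read_from` are hypothetical, and the index here simply maps each `<title>` to an offset in the uncompressed stream.

```python
import bz2
import re

# Matches a MediaWiki <title> element on a single line of the XML dump.
TITLE_RE = re.compile(rb"<title>(.*?)</title>")


def build_title_index(dump_path):
    """Scan the dump once and map each title to the uncompressed byte
    offset of the line holding its <title> element (hypothetical format)."""
    index = {}
    offset = 0
    with bz2.open(dump_path, "rb") as dump:
        for line in dump:
            match = TITLE_RE.search(line)
            if match:
                index[match.group(1).decode("utf-8")] = offset
            offset += len(line)
    return index


def read_from(dump_path, offset, size):
    """Return `size` uncompressed bytes starting at `offset`.

    BZ2File.seek is emulated: it decompresses from the start of the file,
    so this is slow on large dumps. A real reader would index compressed
    block positions instead; that part is not shown here.
    """
    with bz2.open(dump_path, "rb") as dump:
        dump.seek(offset)
        return dump.read(size)
```

The single pass over the whole file is why a first run on a large dump takes a while; afterwards a lookup only needs the saved index.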
11 years ago
Updated to 0.2.10:
- Use a new indexing scheme for the entrylist - articles load faster now
- Upgrade path for old indexing scheme
- Utf8 fixes for non-ascii pathnames
- experimental RPM package - feedback welcome at the project website : https://launchpad.net/wikipediadumpreader
(jul 09: updated the ubuntu package for Jaunty's Python2.6 compatibility)
Updated to 0.2.9:
- Can now load Wiktionary words that are not capitalized
- Ability to load a 64-bit module - Thanks to Michael Heide
- Added a small UI layout - Thanks to GreenReaper
- Better handling of corrupted files
Updated to 0.2.8:
- Sorry: no program changes, but a much friendlier opening dialog
Built a rough Ubuntu package to ease installation for inexperienced users running Ubuntu Gutsy or Hardy
Updated to 0.2.7:
- minor rendering fixes
- a few more macros
Updated to 0.2.6:
- better wikisyntax parsing
- minor bugfixes
Updated to 0.2.5:
- Bugfixes and improvements in rendering.
- Moved the development tree to Launchpad (lp)
- optional fontsize
Updated to 0.2.4:
- Optional LaTeX/texvc call to render math. Thanks to Mathieu Beliveau
Updated to 0.2.3:
- Fixed an obvious overflow bug in the index creation code.
Rebuilding the index is necessary, sorry. To force it, delete the two *idx files before running the program, and be patient (index creation for the English dump takes several dozen minutes).
- basic table and footnotes support
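
The 0.2.3 note above asks users to delete the two *idx files to force an index rebuild. As a rough sketch only: the helper below lists, and optionally deletes, every file matching `*idx` in a given directory. The function name `remove_index_files` and its `dry_run` parameter are hypothetical, and the directory is an assumption (wherever the dump and its index files live on your machine). It is written for Python 3.

```python
from pathlib import Path


def remove_index_files(dump_dir, dry_run=True):
    """Find files whose names end in 'idx' directly inside dump_dir.

    With dry_run=True (the default) nothing is deleted; the matching
    paths are only returned so they can be checked first.
    """
    found = sorted(p for p in Path(dump_dir).glob("*idx") if p.is_file())
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

Calling it once with the default `dry_run=True` shows what would be removed; calling it again with `dry_run=False` deletes those files, after which the program has to rebuild its index on the next start.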
Updated to 0.2.1 : fixed a bug when reading articles on block boundaries
Updated to 0.2.2 : improved wiki rendering for lists and definitions
REMF
11 years ago
Fantastic news if this means what I think it means, i.e. that openSUSE/Mandriva/Fedora users will be able to easily install your fantastic program.
Thanks
REMF
12 years ago
a PyQT 4.5 version, or something else........
benji2
11 years ago
Thanks for your support.
Sadly, I have no real plans beyond occasional maintenance and integrating contributors' help.
WikipediaDumpReader should work with any PyQt4 version from 4.1, including 4.5. If you meant a "WebKit" version, I don't have any plans for that, at least until 4.5 is the default in some LTS release. I take compatibility very seriously as a lot of my users are not bleeding-edge upgraders ;-)
REMF
11 years ago
Once again, my thanks for creating an awesome program.
Do you know if there are any easy-to-install SUSE packages available?
REMF
12 years ago
I note you mention something about Ubuntu packages. Is there any chance you could provide the same convenience for openSUSE?
Many thanks
sinosure
12 years ago
It seems that Maemo doesn't have PyQt4 :(
REMF
12 years ago
Is there any further news on what will happen next with this excellent program?
Forgive my ignorance, but will it work on KDE4, specifically openSUSE 11.1 using KDE 4.1.2?
Cheers
benji2
12 years ago
Wikipedia Dump Reader doesn't use any "KDE" features, only PyQt4. Therefore, it should work the same on KDE 3, KDE 4, or any non-KDE-based environment, as long as PyQt4 is installed.
Regarding future development, I don't have clear plans currently, as it already does what I intended it to do (+ I'm lazy).
Do you think some major feature is missing for convenient use? Maybe the suggested cleaning of non-reachable links should be on my todo list...
REMF
12 years ago
many thanks
applegrew
13 years ago
Another note: when I start dumpReader.py, I get the following errors in the console:
dumpReader.py:11: RuntimeWarning: Python C API version mismatch for module bz2: This Python has API version 1013, module bz2 has version 1012.
import bz2
Error while loading math parser
-----------
I have Python 2.5.1 running in Kubuntu Gutsy Gibbon.
benji2
12 years ago
Thanks for the report. I first need to get a fresher English dump to trigger the bug; I hope to have time to fix it soon.
Regarding the Python error, it's pretty safe to ignore it. If it bothers you, see the included README for why it happens and how to fix it.
REMF
13 years ago
Regards
benji2
12 years ago
Sorry for the delay. As I didn't have much time to work on it, I only did minor updates. I guess I may occasionally hack on it, but not very actively. I moved the source code to Launchpad code hosting for people who may be interested.
benji2
12 years ago
I just uploaded version 0.2.5, which makes changing the font size easier. From the README:
Q. Can I change the text size?
A. Font size can now be changed, although you will have to manually modify
the program: edit the "dumpReader.py" file, go to the line which says
"fontSize = 9" and change "9" to whatever point size fits you best.
This will only change the font size of the text area.
Note that I haven't put any "preferences" dialog in the application itself, as I don't feel it's needed yet.
Regards,
andrewmin
13 years ago
Other than that, great job!
benji2
13 years ago
Regarding your suggestion, it would be great, but it's not possible, because there is no way to "update" an already downloaded dump. The only way to get more up-to-date Wikipedia data is to delete the old dump (including its index files) and download a new one in full.
Therefore, it's pointless to do that automatically. On the other hand, I'll add a few lines to the README explaining exactly that, so users are not confused when they want fresher data.