Extracting data from Wikipedia using Python

Abhijeet Srivastav
May 4, 2020

Python is a beautiful and powerful language, and we can use it for almost any kind of task you can imagine.

In this blog we are going to extract data about a keyword from Wikipedia.

I need to mention one more thing: we are not going to use web scraping for this purpose, and this is going to be a command-line script.

I will soon upload a GUI app that does the same thing using the tkinter module.

Enough talk! Let's get started.

First of all, we are going to install a module that is itself named wikipedia. Interesting, what do you say?

pip install wikipedia

Make sure that your Python interpreter is added to the PATH on your system.

On a Linux system, you can prefix the same command with sudo if the install needs root permissions.

If pip doesn't work, use pip3.
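If you are not sure whether the install went to the interpreter you are actually running, a quick check like the one below can help. This is only a minimal sketch; the helper name is_installed is made up for illustration and is not part of the wikipedia module.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Show which interpreter is running, then whether it can see the module.
print("Interpreter:", sys.executable)
print("wikipedia installed:", is_installed("wikipedia"))
```

If the second line prints False, the module was probably installed for a different Python than the one running this check.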

Now open IDLE, since this is just a small script, and type the following code.

import wikipedia
print(wikipedia.summary('Python programming language'))

This is going to extract a few lines from the top of the matching Wikipedia page.

We can even specify the number of sentences to extract:

print(wikipedia.summary('Python programming language', sentences=2))
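Since this is meant to be a command-line script, the two calls above could be wrapped with argparse so the keyword and the sentence count come from the terminal. This is just one possible sketch: the function names and the -n/--sentences flag are my own choices for illustration, not something the wikipedia module provides.

```python
import argparse


def build_parser():
    """Build the command-line parser (the flag names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Print a short Wikipedia summary for a keyword."
    )
    parser.add_argument("keyword", help="topic to look up")
    parser.add_argument(
        "-n", "--sentences", type=int, default=2,
        help="number of sentences to print (default: 2)",
    )
    return parser


def main(argv=None):
    """Parse the arguments, then fetch and print the summary."""
    args = build_parser().parse_args(argv)
    # Imported here so the parser can be built without the module installed.
    import wikipedia
    print(wikipedia.summary(args.keyword, sentences=args.sentences))
```

To try it, add a call to main() at the bottom of the file, save it under any name (say wiki_cli.py), and run something like: python wiki_cli.py "Python programming language" -n 3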

We can also get pages related to a keyword:

result = wikipedia.search('Artificial Intelligence')

This is going to return a list of pages associated with the keyword.

In [3]: result = wikipedia.search("Artificial Intelligence")
In [4]: print(result)

['Artificial Intelligence', 'Artificial neural network', ..., 'Types of artificial intelligence']
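
Search and summary can also be combined: take the titles that search returns and ask for a short summary of each one. Some titles are ambiguous or have no page, so the rough sketch below catches the module's DisambiguationError and PageError. The function name summarize_search and the fallback messages are made up for illustration, and the function needs an internet connection when called.

```python
def summarize_search(keyword, limit=3):
    """Map each search result title to a one-sentence summary or a note."""
    import wikipedia  # needs network access when the function is called

    summaries = {}
    for title in wikipedia.search(keyword, results=limit):
        try:
            # auto_suggest=False looks up the title exactly as returned.
            summaries[title] = wikipedia.summary(
                title, sentences=1, auto_suggest=False
            )
        except wikipedia.exceptions.DisambiguationError as err:
            # The title is ambiguous; list a few of the candidate pages.
            summaries[title] = "Ambiguous: " + ", ".join(err.options[:3])
        except wikipedia.exceptions.PageError:
            summaries[title] = "No page found."
    return summaries
```

Calling summarize_search('Artificial Intelligence') should give back a dictionary with up to three titles as keys, which you can loop over and print.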