Minion Word Clouds?? Python Word Clouds made easy!

Hey everyone!

While preparing for several Python talks about data science techniques, I came across this cool thing you can do with word clouds in Python! Word clouds can be very useful for showing which words and themes come up most often among a group of people. They also, well, look pretty!

I played around with the wordcloud generator in Python and came up with some cool images!

In order to get started you must install several packages. The easiest way for me was to use pip:

  • Pillow
    • This is one of Python’s imaging libraries. It is a fork of PIL and is the one that is still kept up to date. IMPORTANT: If you already have PIL installed, you will have to uninstall it before installing Pillow.
  • NumPy
    • NumPy is an extremely important package for scientific computing. It is useful for creating N-dimensional arrays and for linear algebra, Fourier transforms, and random number generation.
  • matplotlib
    • Matplotlib is a 2D plotting library and another important tool in scientific computing and data science.
  • wordcloud
    • This is a word cloud generator that sets up the word cloud in a specific shape and then fills in words based on how often each word occurs. The larger the word, the more often it occurs!
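If you want to double-check the installs before going further, here is a minimal sketch that looks for each package from the list above. The PACKAGES mapping and the missing_packages helper are my own names for illustration, not part of any of these libraries. The one thing to watch is that Pillow is imported under the module name PIL.

```python
from importlib.util import find_spec

# Maps the pip package name to the module name you import in Python.
# Note that Pillow is imported as "PIL", not "pillow".
PACKAGES = {
    "Pillow": "PIL",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "wordcloud": "wordcloud",
}


def missing_packages(packages=PACKAGES):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in packages.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Still to install: pip install " + " ".join(missing))
else:
    print("All word cloud dependencies are available.")
```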
After installing all of the dependencies, I worked with the masked version in order to add my own images. The ones supplied in the examples are Alice in Wonderland and the Stormtrooper mask.
[Images: alice_mask, stormtrooper_mask]

These both ran extremely smoothly! I didn’t like how some of the plots came out though, so I added some of my own stopwords. Stopwords are words that you want the generator to ignore. For instance, words that occur often in the English language such as “the”, “and”, “a”, “like”, etc., are not words we want to see in a word cloud.
The wordcloud package already has a set of its own stopwords, but sometimes you need to add more! In my case, I was looking at interview transcripts and the words “interviewer” and “interviewee” came up too many times. So I decided to add them to the stopwords list. To do so, you can just add STOPWORDS.add("interviewer") to your code before the word cloud is generated.
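To see what stopwords actually do to the counts, here is a minimal sketch in plain Python. It is not how the wordcloud package is implemented. The tiny BASE_STOPWORDS set, the word_frequencies helper, and the sample transcript are all made up for illustration, and the real STOPWORDS set is much larger.

```python
import re
from collections import Counter

# Small illustrative stopword set (an assumption for this sketch);
# the set shipped with the wordcloud package is much larger.
BASE_STOPWORDS = {"the", "and", "a", "like", "to", "i"}


def word_frequencies(text, stopwords):
    """Count lowercase words in text, skipping anything in stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stopwords)


# Made-up transcript snippet, just to have something to count.
transcript = (
    "Interviewer: What do you think? "
    "Interviewee: I think time matters, and time is short."
)

stopwords = set(BASE_STOPWORDS)                    # copy, so the base set is untouched
stopwords.update({"interviewer", "interviewee"})   # the two extra words from my transcripts

freqs = word_frequencies(transcript, stopwords)
print(freqs.most_common(2))
```

Without the two extra stopwords, “interviewer” and “interviewee” would be counted like any other word and would crowd the cloud in a long transcript.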
Other than changing the file locations for text and images, I really didn’t have to change anything else! These are my final outputs for Alice and the Stormtrooper mask! Apparently, the people that I interviewed really like to “think” and also had “time” come up quite a bit!
[Images: Alice word cloud, Stormtrooper word cloud]

In order to use your own image, you need to find an image that is filled in with black. If you don’t, only the outline will fill in with words, and then you can’t really see the image shape. Now, I really like minions, so I tried two different images to see how they would turn out. A minion might not be the best shape, but in the second one you can definitely see a better outline!
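Why does the fill matter so much? As far as I understand the wordcloud documentation, pure white pixels in the mask are treated as “masked out” and words are only placed on the remaining pixels. Here is a small sketch that compares a filled shape to an outline of the same shape. The drawable_fraction helper and the two synthetic squares are assumptions for illustration, standing in for real image files.

```python
import numpy as np


def drawable_fraction(mask):
    """Fraction of pixels that are not pure white, i.e. where words may be placed."""
    mask = np.asarray(mask)
    if mask.ndim == 3:
        # Colour image: a pixel is white only if all of R, G and B are 255.
        white = np.all(mask[:, :, :3] == 255, axis=-1)
    else:
        white = mask == 255
    return 1.0 - white.mean()


# Two synthetic 100x100 "images": a filled black square and an outline of it.
filled = np.full((100, 100), 255, dtype=np.uint8)
filled[20:80, 20:80] = 0

outline = np.full((100, 100), 255, dtype=np.uint8)
outline[20:80, 20:80] = 0
outline[22:78, 22:78] = 255   # hollow out the inside, leaving a 2-pixel border

print(f"filled:  {drawable_fraction(filled):.2%}")
print(f"outline: {drawable_fraction(outline):.2%}")
```

The outline leaves only a thin strip for words to land in, which is why the shape is hard to make out in the resulting cloud.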

[Images: minion_white, minions]

[Images: minionFilled, minions2]

I also did it with a puppy, and this one came out great! As you can see, the same words came out the largest in all of them!

[Image: puppy]

Happy word clouding! 🙂 🙂