Warren Buffett Shareholder Letters: Sentiment Analysis in R


Warren Buffett, known as the “Oracle of Omaha”, is one of the most successful investors of all time. Wherever the winds of the market may blow, he always seems to find a way to deliver impressive returns for his investors and his company, Berkshire Hathaway. Every year he writes his famous “shareholder letter” with his musings about the market and investment strategy. Perhaps reflecting his continued success, this sentiment analysis of his letters by data scientist Michael Toth shows that the tone has been generally positive over time. Only five of the forty years of letters show a negative average sentiment; they correspond to the market downturns of 1987, 1990, 2001, 2002 and 2008.

Michael used the R language to generate a sentiment score for each letter, and the process was surprisingly simple (you can find the R code here). The letters are published as PDF documents, from which the text can be extracted using the pdf_text function in the pdftools package. The tidytext package then decomposes the letters into individual words, each of which can be matched against the Bing sentiment lexicon (available through tidytext's get_sentiments function) and labeled positive or negative. From there, a simple ggplot2 bar chart shows the average sentiment score for each letter.
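To make those steps concrete, here is a minimal sketch of the pipeline. It is not Michael's actual code: the two sample "letters" are invented stand-ins for text extracted from the PDFs, the file name in the comment is a placeholder, and the net score formula (positive minus negative words, divided by their total) is an assumption about how one might average the Bing labels.

```r
library(dplyr)
library(tidytext)
library(ggplot2)

# Hypothetical stand-in for the extracted letter text. With real PDFs, each
# `text` value could come from something like
#   paste(pdftools::pdf_text("letter_1987.pdf"), collapse = " ")
# where "letter_1987.pdf" is a placeholder file name.
letters_df <- tibble(
  year = c(1987, 1999),
  text = c(
    "A difficult year: losses, fear and a crash worried investors.",
    "An excellent year with strong gains and outstanding managers."
  )
)

# Split each letter into one lowercase word per row.
words <- letters_df %>%
  unnest_tokens(word, text)

# Keep only words found in the Bing lexicon, then compute an assumed
# net sentiment score per letter: (positive - negative) / (positive + negative).
sentiment_by_year <- words %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  group_by(year) %>%
  summarise(
    positive = sum(sentiment == "positive"),
    negative = sum(sentiment == "negative"),
    net_sentiment = (positive - negative) / (positive + negative)
  )

# Bar chart with one bar per letter, colored by the sign of the score.
p <- ggplot(sentiment_by_year,
            aes(x = factor(year), y = net_sentiment, fill = net_sentiment > 0)) +
  geom_col(show.legend = FALSE) +
  labs(x = "Year", y = "Net sentiment (assumed formula)")
```

Calling print(p) renders the chart. Words that do not appear in the Bing lexicon are dropped by the inner join, so they have no effect on the score.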

For more on the sentiment of Warren Buffett’s shareholder letters, including an analysis of the most-used positive and negative words, follow the link to the complete blog post below.

Michael Toth: Sentiment Analysis of Warren Buffett’s Letters to Shareholders

