Companies and their Relationship to the Environment

Michael Wirtz
Mar 6, 2021


As a continuation of my post from last week (check it out here), I am going to address a company’s relationship to the environment. Arguably the largest ESG-related issue to date, the environment represents a pressing concern for many companies. Highly polluting companies, or companies that have failed to respond to the massive call for environmental action, will be at a disadvantage in the years to come. Research on millennials has shown their strong preference for companies that are leading the charge on this issue. Furthermore, companies may face a prohibitively high cost of capital if they leave environmental issues by the wayside. For these reasons, among countless others, I wanted to try to quantify this relationship.

Quantifying Companies’ Relationships to the Environment

  • Input: a list of publicly traded company names
  • Output: the same list of companies, ranked from worst relationship with the environment to best.
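
A minimal sketch of that input/output contract in Python, assuming each company already has a score where a higher number means a worse relationship. The function name `rank_worst_to_best` and the sample scores are hypothetical and only there for illustration:

```python
def rank_worst_to_best(scores):
    """Return company names ordered from highest (worst) score to lowest."""
    # Sort by score descending; break ties alphabetically so the order is stable.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ordered]


# Hypothetical scores, for illustration only
example_scores = {"Company A": 0, "Company B": 3, "Company C": 1}
print(rank_worst_to_best(example_scores))
```

How the score itself is produced is a separate question; the ranking step only needs one number per company.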


Quantifying this was a deceptively difficult task. There are too many things to account for, and I didn’t have the time or resources to address all the concerns that arose. What I did, however, was develop a scoring system that I believe eliminates many of the concerns that were initially present.


The following challenges were the reason for the negative scoring system that I developed below. Here are two sources for further research on the subjects:


Greenwashing is when a company uses public relations to boast about its positive relationship with the environment in an attempt to overshadow the negative aspects of that relationship. For example, a company that still burns coal may run a marketing campaign about the solar panels it uses to power its plants, while ignoring the coal portion of that equation. The US has minimal regulatory requirements for environment-related disclosures by companies. Therefore, companies have an incentive to report and share only the most positive aspects of their relationships to the environment.

Slack Resources

Applied to ESG, slack resources theory says that companies with more cash on hand have an unfair advantage in the race to improve relations with the environment. Cash-heavy companies are likely to seem more ESG positive on paper because they can actually make the upfront investments in ESG projects and initiatives. Therefore, larger and more profitable firms have a huge leg up on smaller firms that can’t generate the upfront capital to make their operations greener.

Negative Scoring

Because of greenwashing and the slack resources theory, I came up with my own scoring system. Looking at the equation backwards, I hypothesized that judging companies on their negative environmental mentions would put them onto an even playing field.

The Data and Calculations

So how did I put this idea into practice? For now, I made it pretty straightforward. I used NewsAPI to find articles matching a company name and the word “environment,” then scraped the text of each article. From there, I used TextBlob to get the sentiment of each article using its built-in polarity score. My final calculation, per company, was simply the count of articles with negative polarity found for the company in question. Here was the code I used:

import re

import pandas as pd
import requests
from bs4 import BeautifulSoup
from newsapi import NewsApiClient
from textblob import TextBlob


def getting_neg_env_data(companies):
    # Reading in api key (strip the trailing newline from the file)
    with open('/Users/MichaelWirtz/Desktop/pathfile/newsapi_key.txt') as f:
        api_key = f.read().strip()
    # Inputting api key
    newsapi = NewsApiClient(api_key=api_key)
    # One result row per company, turned into a data frame at the end
    rows = []
    for company in companies:
        # Get news articles
        envi_company = newsapi.get_everything(
            q='environment {}'.format(company))
        # Collecting the article urls
        url_list = [article['url'] for article in envi_company['articles']]
        # Getting polarity ratings of articles
        negative_polarity_count = 0
        for url in url_list:
            res = requests.get(url)
            html_page = res.content
            soup = BeautifulSoup(html_page, 'html.parser')
            text = soup.find_all(text=True)
            output = ''
            # NOTE: the original list was cut off here; these tag names are
            # an assumed stand-in for the non-visible elements to skip
            blacklist = ['[document]', 'noscript', 'header', 'html',
                         'meta', 'head', 'input', 'script', 'style']
            for t in text:
                # Skip text nodes that sit inside a blacklisted tag
                if t.parent.name not in blacklist:
                    output += '{} '.format(t)
            # Keep letters only before scoring
            output = re.sub(r"[^A-Za-z]+", ' ', output)
            blob = TextBlob(output)
            polarity = blob.polarity
            if polarity < 0:
                negative_polarity_count += 1
        rows.append({'company': company,
                     'neg_env_count': negative_polarity_count})

    return pd.DataFrame(rows, columns=['company', 'neg_env_count'])
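
One detail worth seeing in isolation is the cleaning step, since it changes the text that gets scored. The sketch below reuses the same regular expression as the function above on two made-up sentences; the helper name `keep_letters_only` is hypothetical:

```python
import re


def keep_letters_only(text):
    # Same pattern as in the function above: every run of non-letters
    # (digits, punctuation, whitespace) collapses into a single space.
    return re.sub(r"[^A-Za-z]+", " ", text)


print(repr(keep_letters_only("CO2 emissions rose 12%!")))
print(repr(keep_letters_only("The company didn't comply.")))
```

Digits disappear, so "CO2" becomes "CO", and contractions are split, so "didn't" becomes "didn t". Either could plausibly shift a polarity score, which is one more reason to treat the counts with caution.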


Below is a graph of the first 5 companies in the data frame:

As you can see, when ranked from highest score to lowest, the top figure is 1. There were a few key issues with these findings. I was a bit in the dark on which articles were given negative sentiment and why. The most pressing question is: were the company and the word environment the cause of the negative sentiment in that article? In most cases, probably not.
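
One possible way to narrow that question, sketched here as an idea rather than something I ran: score only the sentences that mention both the company and an environmental term, instead of the whole page. The keyword list, the naive sentence splitter, and the function name `relevant_sentences` are all assumptions made for illustration:

```python
import re

# Hypothetical keyword list; a real one would need to be much broader.
ENV_TERMS = ("environment", "emission", "pollution", "climate")


def relevant_sentences(article_text, company):
    """Keep only sentences naming the company and at least one environmental term."""
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", article_text.strip())
    company_lower = company.lower()
    kept = []
    for sentence in sentences:
        lowered = sentence.lower()
        if company_lower in lowered and any(term in lowered for term in ENV_TERMS):
            kept.append(sentence)
    return kept


sample = ("Acme was fined for river pollution. "
          "The weather was terrible. "
          "Acme opened a new office.")
print(relevant_sentences(sample, "Acme"))
```

The surviving sentences could then be passed to the sentiment step, so that an unrelated gloomy paragraph elsewhere in the article no longer counts against the company.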


This project, the full notebook for which you can find here, was riddled with flaws and holes. For anyone reading, I would be very grateful for any suggestions about how to clean it up and make it even a bit more accurate and realistic. My ideas moving forward are to: (1) automate the process to get a running count on a daily basis, which has the issue of some companies simply receiving far more press; (2) improve the search methodology to make the data acquisition process a bit more precise through the use of more advanced NLP strategies. I hope that this inspires some further work on the subject. I would love to collaborate with anyone interested in the field of sustainable finance and investment.
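
On the press-volume issue in idea (1), one option would be to report the share of negative articles rather than the raw count. This is only a sketch under that assumption; the function name `negative_share` and the numbers are hypothetical and do not come from the notebook:

```python
def negative_share(negative_count, total_count):
    """Fraction of a company's articles that scored negative; None if no articles."""
    if total_count == 0:
        return None
    return negative_count / total_count


# Hypothetical (negative, total) article counts, for illustration only
coverage = {"Company A": (5, 100), "Company B": (2, 4)}
for name, (neg, total) in coverage.items():
    print(name, negative_share(neg, total))
```

Here Company A has more negative articles in absolute terms, but Company B has the higher negative share. Which of the two is the fairer signal is an open question, and very small totals would make the share noisy.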