# Scrape Keywords from Indeed.com Job Postings

## Job Posting Crawler

This code pulls each job posting for a specific job title in a specific location (or nationwide) and returns and plots the percentage of postings that contain certain keywords. It searches for every word except stopwords and other user-defined words (there is probably a much more efficient way of doing this, but I had no need to change it once I had the code running). This lets the user see the common technical skills, as well as the common soft skills, that should be included on a resume.

NOTE: I got this idea from https://jessesw.com/Data-Science-Skills/. Simply using his code would have been of no real benefit to me, since I wanted to use the idea to improve my skills at scraping data from HTML files. So I took his idea and developed my own code from scratch. I also modified the overall process a bit to better fit my needs.

NOTE2: This code cannot identify multiple-word skills. So, for example, ‘machine learning’ will show up as either ‘machine’ or ‘learning’. However, ‘machine’ could also show up for phrases other than ‘machine learning’.
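
To make the counting step and the single-word limitation concrete, here is a minimal sketch of how the percentages could be computed once the posting text has been scraped. It is not the original code: the function name `keyword_percentages`, the abbreviated `STOP_WORDS` set, and the sample postings are all hypothetical, and the scraping itself is left out.

```python
import re
from collections import Counter

# Hypothetical, abbreviated stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "with", "in", "of", "to", "for", "is"}


def keyword_percentages(postings, extra_stop_words=()):
    """Return {word: percent of postings containing it}, most common first."""
    if not postings:
        return {}
    excluded = STOP_WORDS | set(extra_stop_words)
    counts = Counter()
    for text in postings:
        # Use a set so each word counts at most once per posting.
        words = set(re.findall(r"[a-z][a-z+#]*", text.lower()))
        counts.update(words - excluded)
    total = len(postings)
    return {word: 100.0 * n / total for word, n in counts.most_common()}


# Made-up posting text, standing in for scraped job descriptions.
sample_postings = [
    "Experience with Python and machine learning",
    "Python and SQL required",
    "Machine operator with SQL experience",
    "Strong communication skills",
]

percentages = keyword_percentages(sample_postings, extra_stop_words=["required"])
for word, pct in percentages.items():
    print(f"{word}: {pct:.0f}%")
```

In this sample, ‘machine’ is counted in two of the four postings, but only one of them is about machine learning, which is exactly the ambiguity described in NOTE2. Words passed through `extra_stop_words` are dropped in the same way as the built-in stop words.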

To run the code, set the city, state, and job title to whichever values you wish. After generating the plot, you might need to add unwanted ‘keywords’ to the attitional_stop_words list so that they are excluded.
As some of you know, I prefer to set up passwordless logins to all of my accounts on remote machines. I recently wrote a post describing how to enable passwordless SSH to compute nodes, but what if you want to enable passwordless logins to remote machines?

If you are on a Linux machine, or have a copy of the “ssh-copy-id” script on your system, then the process is fairly simple. You must first create the private/public key pair. For passwordless SSH, just accept the defaults for each option.

ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/cmaqadj/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:


# How to Fix “Output Conversion Error”

As part of my Ph.D. research, I am on a team that is currently developing an adjoint of the EPA’s CMAQ air quality model. While integrating all parts of the model into the full adjoint model, I ran into an error that was rather difficult to resolve.

Running the model would result in many occurrences of the following error:

forrtl: error (63): output conversion error, unit -5, file Internal Formatted Write
Image              PC                Routine            Line        Source
libc.so.6          0000003FE9E1D994  Unknown            Unknown     Unknown