Posted in code on July 2nd, 2010 by Michael Ewens – 1 Comment
If you have a long R script and would like to be notified when computation is complete, follow these directions to get Growl notifications:
- Install Growl and leave the disk image open.
- Open Terminal.
- Run “cd /Volumes/Growl-1.2/Extras/growlnotify”
- Run “./install.sh”
Then follow these directions to get your R script to talk to Growl by adding something like:
system(paste("growlnotify -a R -t \"R is done\" -m", "\"Inserted the data\""))
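The same call can be made from other scripting languages. Here is a minimal sketch in Python, assuming growlnotify was installed as described above. The function names growl_command and notify are my own illustration; only the -a, -t and -m flags come from the R call.

```python
import shutil
import subprocess

def growl_command(title, message, app="R"):
    """Build the growlnotify argument list using the same -a, -t and -m flags as the R call."""
    return ["growlnotify", "-a", app, "-t", title, "-m", message]

def notify(title, message):
    """Send the notification if growlnotify is on the PATH; otherwise do nothing."""
    if shutil.which("growlnotify") is None:
        return False
    subprocess.run(growl_command(title, message), check=False)
    return True

print(growl_command("R is done", "Inserted the data"))
```

Passing the arguments as a list avoids the nested quoting that the pasted shell string needs.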
Posted in code on May 23rd, 2010 by Michael Ewens – 2 Comments
Stata has a ton of flexibility for creating and manipulating dates. However, if you want to save Stata data to an external database (e.g. Access, PostgreSQL, MySQL, etc.), the numeric date format in Stata will be difficult to interpret outside the program. My use case involved using Stata to merge and clean some data that was pushed to a MySQL database with odbc and later loaded in R with its odbc functionality. It was in the last step that I learned of Stata’s dating conventions when writing to external databases. So I asked the Stata list.
A very helpful subscriber presented a solution very similar to one that I had mocked up. With some of the code posted on the Statalist and some new additions, I present odbc2create. UPDATE: I fixed an issue when dealing with a database with no dates and had to add a loop.
This modified odbc command does the following:
- searches all your variables for dates (they must be formatted as dates, otherwise Stata cannot detect them)
- converts those dates to the YYYY-MM-DD format
- inserts the dates into your mysql database as strings
- re-types those date columns in the newly created database as DATEs
The best part: when you load a table created this way back into Stata, it immediately recognizes the DATEs as dates. I hope they build this functionality into Stata in the future. One caveat (which may explain why they haven’t built it internally) is that the ALTER command in the ado file is specific to MySQL. Someone should generalize the code to recognize the datasource engine and modify the ALTER command accordingly.
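To make the conversion concrete, here is a minimal sketch in Python of the two key steps. It is not the code in odbc2create. It relies on the fact that Stata stores daily dates as the number of days since January 1, 1960. The function names, the table and column names, and the exact ALTER statement are illustrative assumptions.

```python
from datetime import date, timedelta

# Stata daily dates count days elapsed since 01jan1960.
STATA_EPOCH = date(1960, 1, 1)

def stata_date_to_iso(days):
    """Convert a Stata daily date (integer days since 1960-01-01) to 'YYYY-MM-DD'."""
    return (STATA_EPOCH + timedelta(days=days)).isoformat()

def alter_to_date_sql(table, columns):
    """Build MySQL statements that re-type string columns as DATE (hypothetical names)."""
    return [f"ALTER TABLE `{table}` MODIFY `{col}` DATE;" for col in columns]

print(stata_date_to_iso(0))  # 1960-01-01
for statement in alter_to_date_sql("deals", ["deal_date"]):
    print(statement)
```

The MODIFY syntax is the MySQL-specific part; other database engines use different ALTER syntax, which is the generalization problem mentioned above.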
Posted in code on April 9th, 2010 by Michael Ewens – 1 Comment
Posted in code on February 21st, 2010 by Michael Ewens – 2 Comments
Perhaps it is bad that I didn’t know this before, but the following Stata code would have saved me a week of dissertation work. Suppose that you have data structured like so:
firm_id,date,amount
and you want to create a new variable that is the total amount as of each date for each firm. In Stata, you simply type:
* sort by firm and date, then take the running sum of amount within each firm
bysort firm_id (date): gen total_t = sum(amount)
Note the use of ‘gen’ rather than ‘egen’. The ‘sum’ function behaves differently under the two commands: with gen, sum() returns a running sum, while with egen it returns the group total, repeated on every row. Once I learned this, about 500 lines of loops written in Stata code could be condensed into a few lines. Stata needs to fix the ‘egen’ and ‘gen’ distinction or I need to port more of my projects to R.
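For comparison, here is a minimal sketch of the same running total in plain Python. The function name and the sample rows are made up for illustration; only the firm_id, date, amount layout comes from the example above.

```python
from collections import defaultdict

def running_total_by_firm(rows):
    """rows: iterable of (firm_id, date, amount) tuples.

    Returns rows sorted by firm and date with the cumulative amount
    for that firm appended as a fourth element.
    """
    totals = defaultdict(float)
    out = []
    for firm_id, day, amount in sorted(rows, key=lambda r: (r[0], r[1])):
        totals[firm_id] += amount
        out.append((firm_id, day, amount, totals[firm_id]))
    return out

# Made-up sample data; ISO date strings sort chronologically.
rows = [
    ("A", "2009-01-01", 10.0),
    ("B", "2009-01-15", 5.0),
    ("A", "2009-02-01", 2.5),
]
for row in running_total_by_firm(rows):
    print(row)
```

The last row of each firm carries the value that egen would put on every row of that firm.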
Posted in economics, visualization on October 31st, 2009 by Michael Ewens – 2 Comments
The AEA’s JOE postings present the near-population of jobs available for newly-minted economics PhDs. I used the XML data available for download to create a mash-up of job locations on Google Maps. I break the postings down into US full-time academic, international full-time academic and non-academic. Here is how I create the maps:
- Select the subset of the data you want (e.g. US academic) and download the XML file.
- Fix some validation errors: take out the “<” and “>” within the text of nodes (I use TextMate for this).
- Parse the XML file with a custom PHP script that creates a csv file with school, position, location and url to posting. Here is my simple script for the academic XML file.
- Save the csv file produced by the script in step 3 as an Excel spreadsheet (Google Docs doesn’t like csv’s). Add a “Latitude” and “Longitude” column to the spreadsheet.
- Upload the Excel file to Google docs.
- Follow these directions to populate the latitude and longitude of each position+location.
- Publish the Google spreadsheet and save the unique id in the url that Google gives you.
- Sign up for a Google Maps API account.
- Follow these directions to produce a Google map of your postings.
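The parsing step can be sketched in a few lines. The version below is in Python rather than PHP and is not my original script. The element names (posting, school, position, location, url) and the sample record are hypothetical, since the real JOE XML schema is not reproduced here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical element names; adjust to the actual tags in the downloaded file.
FIELDS = ["school", "position", "location", "url"]

def postings_to_csv(xml_text, out):
    """Write one CSV row per <posting> element and return the number of rows written."""
    root = ET.fromstring(xml_text)
    writer = csv.writer(out)
    writer.writerow(FIELDS)
    count = 0
    for node in root.iter("posting"):
        # Missing child elements become empty strings.
        writer.writerow([(node.findtext(field) or "").strip() for field in FIELDS])
        count += 1
    return count

# Made-up sample record for illustration only.
sample = """<postings>
  <posting>
    <school>Example University</school>
    <position>Assistant Professor</position>
    <location>Springfield, IL</location>
    <url>http://example.com/1</url>
  </posting>
</postings>"""

buf = io.StringIO()
print(postings_to_csv(sample, buf))
print(buf.getvalue())
```

The csv module quotes fields that contain commas, which matters for locations such as a city and state pair.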
UPDATE: This service may make this process a bit easier, produce cleaner maps and allow the incorporation of more information.
Maybe the AEA can follow these directions to produce these maps after this year. Contact me with any suggestions or questions.
Posted in code on September 15th, 2009 by Michael Ewens – 2 Comments
Matlab has some great plotting tools, but exporting a figure with “Save As” rarely produces consistent and clean results. Enter export_fig. The best feature is its anti-aliasing, which produces clean, crisp fonts. Just download the package and add
addpath('export_fig')
to your m file and you can use the function. I had the best luck (see example below) with the following command:
export_fig('figures/updates/risk_over_size.png', '-png', '-nocrop');

Posted in code on September 2nd, 2009 by Michael Ewens – Be the first to comment
Suppose you want to create a dynamic matrix of strings in Matlab. For example, you might want the legend of your graph to depend on the data (which changes on a daily basis). Cell arrays are your best bet. However, be careful about how you access the elements of such arrays. Suppose I have a cell array constructed as follows:
names = cell(3,2);
names(high_regime, :) = {'Probability of a home run', 'Home Run'};
names(low_regime, :) = {'Probability of bankruptcy', 'Bankruptcy'};
names(middle_regime, :) = {'Probability break-even', 'Break-even'};
If you want to access a particular element of this cell array as a string, you must use the curly brackets like so:
set(plot1(1),'LineStyle','-.','DisplayName',names{1,2});
If you try the standard names(1,2), you get back a 1-by-1 cell array rather than the string inside it, so the function set() will not treat the result as a string.
Posted in Uncategorized on July 23rd, 2009 by Michael Ewens – 1 Comment
If you change your datasets or code a lot in Matlab, it is smart to keep track of the results over time (trust me...). The ‘diary’ function allows you to record all output of your scripts to a file. Add the following code to your main .m file, before any output you want captured, to create a diary/log file whose name is stamped with the time (to the minute) that you ran the script:
% zero-padded timestamp (year_month_day_HHMM) so file names are unambiguous and sort correctly
date_now = datestr(now, 'yyyy_mm_dd_HHMM');
diary(['log_', date_now, '.log']);
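The same naming idea carries over to other languages. Here is a minimal sketch in Python that builds a log file name stamped to the minute. The function name, the "log" prefix default and the zero-padded format are assumptions of this sketch, not part of the Matlab code.

```python
from datetime import datetime

def log_filename(now=None, prefix="log"):
    """Return a log file name stamped to the minute, e.g. log_2009_07_23_1405.log."""
    now = now or datetime.now()
    return f"{prefix}_{now.strftime('%Y_%m_%d_%H%M')}.log"

print(log_filename(datetime(2009, 7, 23, 14, 5)))  # log_2009_07_23_1405.log
```

Zero-padding each field keeps the names unambiguous and makes them sort in chronological order.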
Posted in Uncategorized on June 21st, 2009 by Michael Ewens – Be the first to comment
The myriad of Google searches did not help me set a new PATH variable on my Mac. Here is how I did it:
1. cd to /etc
2. Edit the file ‘profile’
3. Append the path you want to the end of the /usr/local … line.
Posted in Uncategorized on June 9th, 2009 by Michael Ewens – Be the first to comment
What I learned today
Make sure that you set your column collations to latin1_swedish_ci if you want to load a MySQL table not created by Stata.