Exploring Chapter 2 of OpenRefine Pt. 2
Recipe 4
In recipe 4, I learned how to perform text filters and, as with searching any type of document, the approach has its problems. A text filter matches only the text you type, so it does not account for the many variations of a word that might have been entered. In my opinion, this feature is most useful when dealing with a dataset that you created yourself or one with consistent entries (do those even exist?). It reminded me of the Find feature in Excel: a quick way to retrieve data, but accurate results are not guaranteed.
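To show why a plain text filter can miss variations, here is a minimal Python sketch. It is not how OpenRefine implements its filter; the function name `text_filter` and the sample city values are made up for illustration, and the sketch assumes a simple case-insensitive substring match.

```python
def text_filter(values, query, case_sensitive=False):
    """Return the values that contain `query` as a substring.

    Hypothetical helper for illustration only; it is not OpenRefine's API.
    """
    if not case_sensitive:
        query = query.lower()
        return [v for v in values if query in v.lower()]
    return [v for v in values if query in v]


# Made-up sample column with several ways of writing the same city.
cities = ["New York", "new york", "NYC", "New  York", "Nwe York"]

matches = text_filter(cities, "new york")
missed = [c for c in cities if c not in matches]

print("matched:", matches)
print("missed:", missed)
```

In this sample the abbreviation, the entry with a double space, and the misspelling all slip past the filter, which is the kind of gap described above.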
Recipe 5
In recipe 5, I learned more about retrieving records from a dataset. The most beneficial part for me was the ability to edit cells to convert uppercase letters to lowercase. What I enjoyed most about this feature was being able to edit multiple cells at once, which makes it a quick way to change a large set of data.
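The idea of changing many cells in one step can be sketched in Python as follows. This is only an illustration under assumptions: the helper `lowercase_column`, the column name `category`, and the sample records are hypothetical and do not come from the recipe.

```python
def lowercase_column(rows, column):
    """Return copies of `rows` with the text in `column` lowercased.

    Hypothetical helper; non-text cells (such as None) are left unchanged.
    """
    updated = []
    for row in rows:
        new_row = dict(row)  # copy so the original rows are not modified
        value = new_row.get(column)
        if isinstance(value, str):
            new_row[column] = value.lower()
        updated.append(new_row)
    return updated


# Made-up sample records with inconsistent capitalization.
records = [
    {"id": 1, "category": "BOOKS"},
    {"id": 2, "category": "Books"},
    {"id": 3, "category": None},
]

lowered = lowercase_column(records, "category")
print([r["category"] for r in lowered])
```

Returning new rows instead of editing in place keeps the original data available, loosely similar in spirit to being able to step back through a change history.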
Recipe 6
In recipe 6, I learned how to apply everything from the chapter to make changes to my dataset. While I appreciated learning the purpose of each tool along the way, actually changing the dataset is what brought everything I had learned together. Although I know these changes can easily be reversed through the Undo / Redo tab, it was nice to see how all of these tools work together to eliminate duplicate data, mistakes, and empty records that have no place in the dataset.
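As a rough illustration of combining these cleaning steps, the Python sketch below drops empty records and duplicates in a single pass. It assumes that "duplicate" means the same value in one key column after trimming whitespace and ignoring case; the helper `clean_rows`, the `title` column, and the sample data are hypothetical, and OpenRefine itself does this through its interface rather than through code like this.

```python
def clean_rows(rows, key):
    """Drop rows whose `key` cell is empty, then drop later duplicates.

    Hypothetical helper: two rows count as duplicates when their `key`
    values match after trimming whitespace and lowercasing.
    """
    seen = set()
    cleaned = []
    for row in rows:
        value = (row.get(key) or "").strip()
        if not value:
            continue  # empty record: nothing useful in the key column
        normalized = value.lower()
        if normalized in seen:
            continue  # duplicate of a row that was already kept
        seen.add(normalized)
        cleaned.append(row)
    return cleaned


# Made-up sample data with a duplicate and two empty records.
records = [
    {"id": 1, "title": "Steam Engine"},
    {"id": 2, "title": "steam engine "},
    {"id": 3, "title": ""},
    {"id": 4, "title": None},
    {"id": 5, "title": "Telescope"},
]

cleaned = clean_rows(records, "title")
print([r["id"] for r in cleaned])
```

The first occurrence of each value is kept, so the order of the surviving rows matches the original dataset.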
Through the rest of the recipes, I was able to explore more tools within OpenRefine. After testing the features individually, a recipe like recipe 6 was a great way to put several of them to work at once and achieve some level of data cleaning.
Good work! Keep up those OpenRefine skills and be sure to put them on your CV when you look for positions!
Dr. MacCall