Exploring Chapter 2 of OpenRefine Pt. 1
I am quick to admit that uploading the GZ file was something new for me, and I struggled with it for over an hour. After testing nearly every option I saw in OpenRefine, I finally found where to import a project that has already been “worked on”. I do not remember the chapter giving any guidance on how to load this data, and I believe clearer instructions here would go a long way. At last it is done, so I can begin working on Recipe 1.
Recipe 1
In recipe 1, sorting seemed fairly straightforward and somewhat like the sort feature in Excel. The main difference worth noting, something addressed in Chapter 1, is the ability to easily undo and redo changes made to the data set. Being able to sort the data and then restore it to its original state makes the sorting feature rather nice.
Recipe 2
In recipe 2, I learned how to use facets to retrieve records. This can come in handy when I want to weed out the records with obvious errors or even isolate a specific range of records. Similar to what I learned in Chapter 1, I have found that the most beneficial tool in this recipe is the ability to flag multiple entries at once and even filter records to show just the flagged entries.
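One rough way to picture what a text facet does, outside of OpenRefine, is as a tally of the distinct values in a column, followed by a filter on the value you pick. The Python sketch below is only an illustration of that idea: the sample rows, the "category" column, and the function names are all made up, and it is not how OpenRefine itself implements facets or flags.

```python
from collections import Counter

# Hypothetical sample rows; the "category" column is invented for illustration.
records = [
    {"id": 1, "category": "Jewellery"},
    {"id": 2, "category": "Ceramics"},
    {"id": 3, "category": "Jewellery"},
    {"id": 4, "category": ""},  # a blank value, the kind of obvious error a facet exposes
]

def text_facet(rows, column):
    """Count how often each distinct value appears in the given column."""
    return Counter(row[column] for row in rows)

def flag_matching(rows, column, chosen_value):
    """Return the ids of all rows whose column equals the chosen facet value."""
    return [row["id"] for row in rows if row[column] == chosen_value]

facet = text_facet(records, "category")
flagged_ids = flag_matching(records, "category", "")

print(facet)        # value counts, like the list shown in a facet panel
print(flagged_ids)  # ids of the rows with a blank category
```

Picking a value in the tally and collecting every matching row at once is the part that mirrors flagging multiple entries in one step.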
Recipe 3
In recipe 3, I learned how to find duplicate records using tiered facets. This was a rather lengthy process, but it was quicker than going through all of the records manually. It was a little confusing during my first attempt, but I later realized that the confusion stemmed from not clearing the facets from recipe 2. Once I reset my data, I was able to work through the recipe with ease.
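The idea behind tiered facets can be sketched in Python as grouping rows by one column and then narrowing the groups with a second column. This is a minimal sketch under assumptions: the rows, the "record_id" and "title" columns, and the helper names are hypothetical, and the recipe's actual steps in OpenRefine may differ.

```python
from collections import defaultdict

# Hypothetical rows; column names are invented for illustration.
rows = [
    {"id": 1, "record_id": "A100", "title": "Vase"},
    {"id": 2, "record_id": "A100", "title": "Vase"},
    {"id": 3, "record_id": "B200", "title": "Bowl"},
    {"id": 4, "record_id": "A100", "title": "Vase (copy)"},
]

def group_by(rows, key_columns):
    """Map each combination of key-column values to the ids of the rows that share it."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        groups[key].append(row["id"])
    return groups

def duplicates(rows, key_columns):
    """Keep only the groups that contain more than one row."""
    return {key: ids for key, ids in group_by(rows, key_columns).items() if len(ids) > 1}

# First tier: rows that share a record_id.
first_tier = duplicates(rows, ["record_id"])
# Second tier: of those, rows that also share a title.
second_tier = duplicates(rows, ["record_id", "title"])

print(first_tier)
print(second_tier)
```

Each extra column added to the key plays the role of another facet stacked on top of the previous one, which is also why a leftover facet from an earlier step changes which rows show up.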
In these three recipes, everything seemed pretty straightforward, and the screenshots in the text were extremely helpful because they let me confirm that what I had done was correct.