Posts

Experiencing Wikidata Query Service

Today, I took the time to explore the Wikidata SPARQL query service. The first query I ran located items that were an instance of (P31) house cat. The query results appeared in a spreadsheet-style format below my search. After learning how to perform these queries, I was able to explore Wikidata and craft my very own queries. To start, I made a simple transition from house cat (Q146) to sloth (Q2274076). Since I kept the property the same but changed the value, I received the same kind of exhaustive list that house cat produced, but for sloths.

Feeling semi-confident about venturing into new queries, I performed additional property-value searches that would allow me to test my knowledge of the overall process. During one of my explorations, I wanted to find books that fell within the same series as Harry Potter and the Philosopher’s Stone. At first, I attempted my query with the property as part of the series (P179) and the value as Harry Potter and the Philosopher’s Stone (Q102438). This search…
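For reference, here is a sketch of how these lookups could be run from R using the WikidataQueryServiceR package. The package choice, the LIMIT, and the two-hop series pattern are my own assumptions about how I would approach it, not steps from the tutorial.

    # A sketch of the queries described above, run from R with the
    # WikidataQueryServiceR package (an assumption; any SPARQL client works)
    library(WikidataQueryServiceR)

    # Items that are an instance of (P31) house cat (Q146)
    cats <- query_wikidata('
      SELECT ?item ?itemLabel WHERE {
        ?item wdt:P31 wd:Q146 .
        SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
      }
      LIMIT 25
    ')

    # Books in the same series as Q102438: first find the series the book
    # is part of (P179), then pull every other item in that series
    series_books <- query_wikidata('
      SELECT ?book ?bookLabel WHERE {
        wd:Q102438 wdt:P179 ?series .
        ?book wdt:P179 ?series .
        SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
      }
    ')

The second query differs from my first attempt: instead of asking for items whose part of the series (P179) value is the book itself, it first finds the series that Q102438 belongs to and then lists everything else that shares that same series value.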

Working through R: Part 2

Last night, the amazing Brittani sent me a wonderful tutorial on how to import data into R on a Mac. This was the most beneficial thing I have received for this assignment; without it, I would have been unable to work through part 3 of Clarke’s data exploration tasks. This weekend, I thought I had uploaded the data in its entirety, but I was mistaken. Using the tutorial, I was able to load my data into R and get started on Clarke’s tutorials. With the data in the system, I was able to work through the tutorial quickly and retrieve the intended results. While I did have a few hiccups (Clarke’s tutorial had some slight errors in field names), I was able to resolve the problems and continue. One thing that I noticed during my attempts is that capitalization matters! If you forget to capitalize something, it could very well result in an error message. That was a learning experience, but I caught on quickly. Overall, I was able to muddle through the tutorial.
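For anyone following along, the import step plus the capitalization trap look roughly like this in R; the file and column names below are invented for illustration, not the actual assignment data.

    # Loading a CSV into R (placeholder file name)
    surveys <- read.csv("surveys.csv")

    # R is case sensitive, so column names must match exactly:
    head(surveys$species_id)    # works if the column is named species_id
    head(surveys$Species_id)    # NULL: "Species_id" is a different name,
                                # and downstream functions will then error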

Working through R: Part 1

Over the course of the weekend, I have had some time to explore R and its many parts as I worked through parts 1-3 of our assignment. There were many times during the assignment when I felt like throwing my computer across the room… (I admit, I did toss it across the bed and scream) but I persisted, nonetheless. While I have found R a rather fascinating tool for working with data, it is quite frustrating if you are a novice. Muddling through the instructions, I encountered many issues when it came to installing packages, connecting with ORCID, and performing some of the functions in Clarke’s tutorial. Perhaps this was in part due to the uploading procedure not being as described in the instructions (could that be a Mac issue?). Anyway, after I finally got connected and got the data loaded into R, I was ready to go. It started off great, and I was able to easily explore the data frames. I was even able to get the same results as Clarke showed in his tutorial; what a success!
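In outline, the setup steps I fought with look something like the sketch below. I am assuming the tutorial used the rorcid package for the ORCID connection (the package list is my guess), and R’s built-in iris data stands in for the assignment’s data frame.

    # Installing and loading packages (the package choice is an assumption)
    install.packages(c("rorcid", "tidyverse"))
    library(rorcid)

    # Connecting with ORCID: rorcid sends you to the browser to authorize
    orcid_auth()

    # The kind of data frame exploration the tutorial walks through,
    # shown here on R's built-in iris data
    str(iris)      # structure: column names, types, a few values
    head(iris)     # first six rows
    summary(iris)  # per-column summary statistics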

Exploring Chapter 2 of OpenRefine Pt. 2

Recipe 4
In recipe 4, I learned how to perform text filters, and just like searching any type of document, it is riddled with problems. While you can certainly perform a text filter, it does not account for the many variations of a word that might have been typed in. In my opinion, this feature is handy when dealing with a dataset that you created yourself, or a dataset with consistent entries (do those even exist?). This reminded me of the Find feature in Excel: it is a quick way to retrieve data, but accurate results are not guaranteed.

Recipe 5
In recipe 5, I learned more about retrieving records from a dataset. The most beneficial feature for me was the ability to edit cells to remove uppercase letters. What I enjoyed most was the ability to edit multiple cells at once; it is a quick way to make changes to a large set of data.

Recipe 6
In recipe 6, I learned how to apply everything I had learned throughout the chapter and make changes to my dataset. While I appreciated learning…
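For comparison, here is a rough sketch of recipes 4 and 5 expressed in R. This analogy is my own, not OpenRefine’s actual mechanics, and the little data frame is made up.

    # A made-up column with the inconsistent entries text filters struggle with
    df <- data.frame(city = c("Boston", "boston", "BOSTON", "Chicago"))

    # Recipe 4's text filter: an exact match misses the variants...
    df[grepl("Boston", df$city), ]
    # ...while a case-insensitive match catches all three spellings
    df[grepl("boston", df$city, ignore.case = TRUE), ]

    # Recipe 5's bulk cell edit: transform every cell in one pass,
    # much like OpenRefine's "to lowercase" cell transform
    df$city <- tolower(df$city)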

Exploring Chapter 2 of OpenRefine Pt. 1

I am quick to admit that uploading the GZ file was something new for me, and I struggled with it for over an hour. After testing nearly every option I saw in OpenRefine, I finally found where to import a project that has already been “worked on”. When reading the chapter, I do not remember seeing any guidance on how to handle this data, and I believe some clarity here would go a long way. Alas, it is done, so I can begin working on recipe 1.

Recipe 1
In recipe 1, sorting seemed fairly straightforward and somewhat like the sort feature in Excel. The main difference worth noting, something addressed in chapter 1, was the ability to easily undo and redo the changes that have been made to the dataset. The convenience of sorting and easily restoring the information to its original state makes the sorting feature rather nice.

Recipe 2
In recipe 2, I learned how to use facets to retrieve records. This can come in handy when I want to weed out the records with obvious errors…
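As a rough point of reference, recipes 1 and 2 have simple analogues in R; again, this is my own illustration with an invented data frame, not OpenRefine itself.

    # An invented data frame with a near-duplicate worth faceting out
    df <- data.frame(name = c("Ann", "ann", "Bob", "Cy"),
                     year = c(2019, 2020, 2019, 2021))

    # Recipe 1's sort: reorder rows by a column
    df[order(df$year), ]

    # Recipe 2's text facet: count each distinct value, which makes
    # near-duplicates like "Ann"/"ann" and obvious errors easy to spot
    table(df$name)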

Exploring OpenRefine

Chapter 1 Overview in Relation to Excel
When using OpenRefine, I noticed many similarities to its Excel counterpart. After working my way through the chapter, I can definitely say that I prefer OpenRefine’s editing tools over those available in Excel. The ability to undo/redo from any point in the project is more convenient than what Excel offers, where you would have to keep saving separate versions of the dataset so that no permanent damage is done to the original data. To record my thoughts throughout chapter 1, I made notes on each recipe as I worked through them.

Recipe 1
In recipe 1, I had to install the OpenRefine application. For the most part, this was pretty straightforward, but I did have to go in and edit my security settings before it would allow me to complete the download. After doing that, I was ready to go.

Recipe 2
In recipe 2, I had to upload a dataset into OpenRefine. Once it was uploaded, I was able to play around with the settings to see how it…
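The undo/redo point is really about non-destructive editing, and the same habit can be sketched in R: keep the raw import untouched and work on a copy. The file and column names below are placeholders.

    # Read the raw data once and never modify it
    raw <- read.csv("dataset.csv")

    # Do every edit on a copy, so the original stays safe
    working <- raw
    working$name <- trimws(working$name)

    # "Undo everything" is just re-copying the untouched original
    working <- raw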

Spreadsheets for Data Management

When dealing with data, you must be consistent during the data entry process to ensure your records are crafted with accuracy and clarity. In an ideal world, data would be entered in the same format every single time with the same values, but that is not always the case. During the exercises from Tidy Data for Librarians, I was able to see not only how data can be misconstrued but also how it can be analyzed for errors and edited for clarity. What I found most useful from these exercises was the tip of keeping a notes sheet as you work so that you can track what you are doing. In the past, I would have a Word document open in hopes of keeping track of my information, only to realize later that I hadn’t created a thorough record of my changes. Overall, these were great exercises that enabled me to refresh my Excel skills, and I am now more comfortable setting data validation, retrieving dates, and using color scales to locate errors. For anyone who needs to learn more of…
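To ground the validation and date steps, here is how the same checks might be sketched in R rather than Excel; the column names and allowed values are mine, invented for illustration.

    # Invented records with one bad category and one oddly formatted date
    df <- data.frame(species = c("cat", "Cat", "dog"),
                     date = c("2021-01-05", "2021/01/06", "2021-01-07"))

    # "Data validation": flag entries outside an agreed-upon list
    allowed <- c("cat", "dog")
    df$species_ok <- df$species %in% allowed   # FALSE for "Cat"

    # "Retrieving dates": parse against one format and surface mismatches
    df$parsed <- as.Date(df$date, format = "%Y-%m-%d")
    df[is.na(df$parsed), ]   # the row whose date did not match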