In 1935, the mathematical world took an epic step forward with the publication of The Design of Experiments by the British statistician Ronald Fisher. Until then, the world seemed too big to make reasonable assumptions of fact. How could you know how many people lived in a country if you couldn’t count every single person? How could you know how much coal a mine shaft could yield if you hadn’t smashed through all the rocks? Questions like these formed the foundation of the now universally accepted concepts of statistical sampling and the inferences that can be drawn from a sample about the total population. This giant step forward meant we could design tests to make reasonable estimates about the whole while examining only one of its parts.
Since 1935, countless contributions in the fields of statistics, computer science, mathematics, physical sciences, and medicine (among others) have dramatically improved our ability to make statistically valid inferences based on an examination of samples. But at its core, the world of Fisherian statistics (often called “frequentist statistics”) has been predicated on the simple premise of estimation from pieces of the whole pie.
My statistics professors were fond of saying: garbage in, garbage out. This applies to many areas of life: health, speech, statistics, etc. We all know that when we eat a lot of unhealthy food, our bodies start to reflect an unhealthy state. When we hear a lot of swearing or crude language, it comes into our minds much more quickly. When we have data that does not meet the assumptions a particular type of analysis requires, we get outcomes that may or may not be accurate, and either way it’s a risk to trust them.
Another potential pitfall is miscommunication of assumptions. If, as an analyst, my assumptions are different from those of the business owner who will be making decisions based on the data, the result can be bad decisions.
Think of all the decisions we make in a day. For me, it starts with my alarm going off at 5:30 and the decision to get up or sleep longer. Assuming I choose to get up, the next decision is whether I run on the treadmill or do some other type of exercise. And the list of decisions goes on and on. Each decision in life is based on two things: data and assumptions.
We take in data all of the time. When I wake up, I use my senses to take in data that answer all sorts of questions: what time is it? Has the sun come up? Am I still tired? What day of the week is it? How do I feel? We also make assumptions all of the time: if it is a weekday, I assume I need to go to work; if I choose to get up, I assume that the benefits of getting up and exercising outweigh the benefits of continuing to sleep; if I choose to run on the treadmill, I assume that the treadmill will work; I assume that when I turn the light on the room will be illuminated. We then repeat this process with every decision we make throughout our day.
Pretend you are data mining: you find something, and you are super excited about it. Then you present it to the business and they couldn’t care less, or worse, they shut you down completely. Similarly, have you ever received an assignment and, once you completed it and handed it over, heard, “What I really want is…”?
Both of these examples show the same thing: communication problems. I can hear what you are saying: “I am an analyst, all I do is run an analysis and send it out. The numbers speak for themselves.” In college, my statistics professors used to tell us that the best thing we could do is learn how to communicate well with others. They used to joke that they were going to start requiring communications courses as part of the stats major. I have finally come to understand that they were not wrong.
Greetings fine peoples of the Internets. Brevity is the key to this post. So here we go. Tracking stuff is good…no duh, right? Everyone knows that. The same way everyone knows oranges taste good (and yes, Ms. Benes, they’re delicious). It’s just a basic assumption. It’s what we do. We track things. Tracking stuff is important. No but seriously, tracking stuff is important.
I love stories. I love reading. I love movies. I love TV. It gives me a chance to walk in someone else’s shoes for a few minutes, hours, or days. It’s awesome. It sparks the creative part in my brain, and I get to ask what I would do in that situation. I’ve even been trying to write a novel. I haven’t given up, but I haven’t made very much progress recently. I have thought about “why” I am not making much progress. The answer I have come up with, and I’m a little ashamed to admit this, is that I have not been putting in the time to hone my skills in the basics and have been trying to advance my novel too quickly.
There are times to forge ahead and times to hone the basic skills. Even at times when you have moved past the basics, it is beneficial to go back and work on the basics. This is important with any discipline—and statistics is no different.
Moving day! Friday at 5pm our Facilities department started moving the entire marketing department from the second floor down to the first. Today has been the first day in our new digs. It’s very different. It’s an open configuration. With convertible desks, we have the option to stand or sit. Also, one of our development teams is sitting with us. It’s fun to now be with the entire marketing team. It encourages the creative juices to flow. All in all, it’s a big change.
One of the potential pitfalls in the analytics world is getting bogged down with requests and reporting. It’s easy to become a reporting monkey. Once in that role, it’s hard to get out of it. But it is possible.
I was in a conversation a couple of months ago about building a new report for some of the marketing folks to use. One limitation of the program being used to build the report is that its formatting options are limited. Now that we had the specifics of what was wanted, I suggested we talk about how to format the report before we built it. I was shot down because “we can worry about formatting later.”
I can see people reading this blog rolling their eyes and saying “you really wanted to talk formatting that early in the process?” Yes! Formatting matters.
Whether or not we like to admit it, building and distributing reports is a function of any analytics team. Some days it feels like all I am doing is reporting. Other days I get to dive into the data and search for interesting findings. Reports are used in all types of disciplines: doctors use blood reports to determine if a person has deficiencies in vitamins, antibodies, and other blood chemistries; engineers use reports to determine what materials are needed to build things; retailers use reports to update inventories, determine conversion statistics, improve processes, and a plethora of other things.
In my personal life, I reconcile my bank statement (a report) with my receipts to make sure they match. Reports are everywhere. Even people who think “I never do anything with numbers or reports,” actually, well – do. Reports are generated from time cards, from your activity on the internet, from going to the grocery store, everything.
There are a few things that I have been thinking about lately:
- The 2.5 quintillion bytes of data that are created every day
- Fraud, identity theft, hackers, viruses, and other not-fun things
I am sitting here in the office, typing this blog, the author of a small amount of the 2.5 quintillion bytes of data that will be created today. The immense volume of data created every day is awesome and frightening. To a certain degree, we have all reached the status of celebrity: everything we do and say can be recorded and used for us or against us. However, not all of us lead interesting enough lives to have our every movement blasted all over television, the internet, and social media. But still, teams of analysts are analyzing what is happening, usually at an aggregate level. Then algorithms are built to impact us on an individual level. Feeling paranoid yet? If not, look at all of the points of contact in your life that are recorded: