What's The Logic?
Something I get asked often is "How can you analyze wolves without using your own observational data collected from wild wolves?" It's a logical question. After all, what is science without direct observation? My research, however, is built on models fit to a data set I've compiled from many previously collected sources. How can I do that? Why can I draw conclusions about wolves and the variables that affect livestock predation when I'm not physically taking data from ranches and forests?
What's a Model?
I don't know, what's a model with you? (Ha. Yes, I paraphrased the joke from Timon in The Lion King)
A statistical model is basically a constructed formula that uses one or more predictor variables to predict the variable being tested (the response). Remember y = mx + b? Technically, that's a model. You're multiplying an (x) variable by the slope (m) and adding the y-intercept (b) in order to predict the (y) variable. Of course, statistical modeling gets much more complicated than that.
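To make the y = mx + b idea concrete, here is a minimal sketch in plain Python that fits a straight line by ordinary least squares and then uses it to predict. The numbers are toy values made up for illustration (they are not wolf or livestock data), and the function names `fit_line` and `predict` are my own for this example, not part of any package.

```python
# Toy, made-up numbers purely for illustration (not real field data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # predictor variable (x)
ys = [3.0, 5.0, 7.0, 9.0, 11.0]   # response variable (y)


def fit_line(xs, ys):
    """Estimate slope (m) and intercept (b) by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


def predict(x, m, b):
    """Use the fitted model y = mx + b to predict y for a new x."""
    return m * x + b


m, b = fit_line(xs, ys)
print(f"slope m = {m}, intercept b = {b}")
print(f"predicted y at x = 6: {predict(6.0, m, b)}")
```

Real ecological models have many predictors and messier data, but the logic is the same: estimate the formula from data you have, then use it to predict or explain the variable you care about.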
But why use it for science? Isn't science all about behavioral observations and drawing conclusions from those? Observational and field data (like scat samples, behavioral assays, and sex ratios) are useful in all fields of biology, but they have their limits... especially when it comes to ecology.
Ecology: The Science of Uncertainty
"...ecological data are often complex and require complex model structures, and the fitting and interpretation of such models is not always straightforward." (Harrison et. al, 2018)
Broadly, ecology is the study of ecosystems. Ecosystems, broadly, are complex biospheres of similar vegetation and organisms that interact with each other. Notice I'm using "broadly" a couple of times. That's because not only is the actual field of ecology broad, but many of its topics (population dynamics, human-ecosystem interactions, climate influence, predator-prey relationships) are not easily addressed with "raw" data. Why? THERE'S SO MUCH STUFF TO CONSIDER AND YOU CANNOT PHYSICALLY COLLECT ALL THE DATA FOR ALL THIS STUFF IN A TIMELY AND COST-EFFICIENT MANNER.
This is why so many ecologists use statistical modeling. Models offer a way for scientists to take many, many variables and combine them in a statistically sound way to draw conclusions and predict effects. There are hundreds upon hundreds of published papers based solely on statistical modeling and inference (like mine) in topics ranging from the age effects of lion trophy hunting to mosquitoes. If you have enough data from previously collected studies or databases and combine them into one huge data set, you can answer questions that haven't been answered before, and do so with limited bias, since you're using data that has already been collected and approved. You can do so without the hassle and politics of permits, land ownership, weather, time, rural difficulties, etc. You can even look at historical data and use modeling to test an effect or reveal unseen relationships between organisms or biological trends. The possibilities are endless! Models aren't limited to ecology, by the way; they're also used in fields like human medicine, mathematical theory, physics, and so on.
Accurate Populations: Hide & Seek
If you have ever looked for animals in the wild, whether on safari, on a hunt, on data collection, or just for fun, you know that finding your quarry is often easier said than done. In Tanzania, when we worked with the Tarangire Lion Project using actual radio technology on a collared lioness, we spent three hours looking for the pride. And guess what? We still didn't find them. The collared lioness was somewhere within 15 yards of us, but she was still an expert at hiding herself. There were five big trucks filled with professionals and students, and we did not find a collared lioness or her pride.
Yet some people, when questioning my research, say that I can't get accurate results without going out in the field, counting the wolves myself, and seeing the environment myself. To a point, this is logical. But in reality, this tactic is impossible in most ecological experiments. It's why state agencies, for example, report state "minimums" of wolves (by the way, for my analysis, I don't use the minimum for the wolf population). There's no way to know the exact number of wolves, even if you're following them on foot 24/7; animals are better at hiding than we are. The same goes for other populations of animals. And that's another reason statistical modeling is great for ecology. Deer have actually been a common subject in the advancement of ecological statistical modeling. So have black bears, lions, and countless other animals.
BIAS
"You should directly be speaking with ranchers and hunters and trappers who actually live with these wolves for your results instead of using this other data, because otherwise your work is flawed." No, I shouldn't. People always think my work is biased, but if you actually want an example of bias, there you go. First of all, it's impossible to interview everyone. But then again, I just said it's impossible to make true population data for animals, so why can't I just model conversations from these groups? Because bias. Agents, biologists, professionals, etc. who collect the data I use do so after training and under a certain code of data collection that at least eliminates most of the bias. Many ranchers and hunters hate wolves. How could I, a scientist wanting unbiased data for unbiased analysis, trust this form of data collection.
And even people who spend all day in the woods and near the wolves cannot offer a perfect idea of population or behavior. You cannot be everywhere at once, especially not in 8 different states over 10 years. By the way, when state Fish and Wildlife or Fish and Game departments collect data to estimate population numbers, they often use hunter surveys, especially for ungulates.
Modeling Is Frustrating - Very Frustrating
In terms of cost and physical exertion, modeling is fairly easy. But that's about all of the "easy" part. The process of modeling ecological variables is long, tedious, frustrating, and time-consuming. In research like mine, where I gather data from previously collected sources, even assembling my big database is HARD. I have to make sure that my variables are consistent across ten years and eight states; that means sometimes having to do conversions to make sure I can analyze things evenly. An example of this is livestock loss, my main "y" variable, or the thing I'm trying to predict. I actually test two "y" variables: cattle and sheep. But then I have to decide: "Am I counting confirmed loss by wolves? Do I count probable wolf attacks? What about just counting predation 'incidents' as opposed to loss count? If I only count confirmed loss, what about the real losses that aren't recorded? But if I include them all, what about the bias of the many wolf-hunting ranchers who might report a wolf attack when it really wasn't one?" After I finally figure that out, I then have to make sure that the collected data from ALL the states and ALL the years EXACTLY matches my decision.
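As a rough sketch of what "making the data match my decision" can look like in code, the Python snippet below applies one counting rule to every record, whichever state or year it came from. Everything in it is a hypothetical stand-in: the record layout, the status labels, the state names, and the head counts are made up for illustration and are not my actual data set or any agency's reporting format.

```python
from collections import defaultdict

# Hypothetical records: (state, year, species, status, head_lost).
# All values are invented for illustration only.
records = [
    ("State A", 2009, "cattle", "confirmed", 4),
    ("State A", 2009, "cattle", "probable", 2),
    ("State A", 2009, "sheep", "confirmed", 10),
    ("State B", 2009, "cattle", "confirmed", 3),
    ("State B", 2009, "cattle", "unverified", 5),
]


def tally_losses(records, species, allowed_statuses):
    """Sum head lost per (state, year) for one species under one counting rule."""
    totals = defaultdict(int)
    for state, year, sp, status, head in records:
        # The same rule is applied to every state and every year.
        if sp == species and status in allowed_statuses:
            totals[(state, year)] += head
    return dict(totals)


# Two candidate definitions of the "y" variable for cattle.
confirmed_only = tally_losses(records, "cattle", {"confirmed"})
confirmed_or_probable = tally_losses(records, "cattle", {"confirmed", "probable"})

print(confirmed_only)
print(confirmed_or_probable)
```

The point of writing the rule as one function is that the definition of "loss" lives in a single place, so changing your mind about probable attacks means changing one argument instead of re-editing every state's numbers by hand.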
Then I have to do that 20+ more times, once for each variable I have. And that's only the initial data preparation. Look at this flow chart to see more details of the actual project. When it comes to the actual coding in RStudio, I spend a good deal of my time screaming at my computer because, as anyone who uses R, Python, or any coding software for statistics knows, it's beyond temperamental.
Here's a cool write-up on the statistical modeling process: "The 13 Steps in Statistical Modeling in Any Regression or ANOVA".
IN CONCLUSION
At a glance, it may seem ridiculous to use previously collected data in a series of statistical models in order to draw conclusions about wild wolves, but here's why it works:
- First of all, my analysis runs from 2009-2018, so I obviously have to use past data.
- This way is cost efficient, physically efficient, and in some ways, time efficient.
- Modeling helps account for errors in population counts and other ecological variables.
- Modeling helps analyze data that has never been analyzed - it answers new questions.
- Modeling combines the efforts of many different agencies into one big picture.
- Asking hunters and ranchers is bias heaven and, in the science community, ludicrous as the sole form of data collection.
- Ecology is uncertain.
- Modeling helps standardize variables.
- Modeling helps reduce or eliminate bias. You're literally just looking at numbers or binary factors (like "yes" or "no").
- MODELING IS SCIENCE AND THAT'S ALL THERE IS TO IT.
Header photo: Paul Nicklen (@paulnicklen on Instagram)