Strength in Numbers

Winter 2022
Illustration by Matt Chinworth

To help their peers ask new questions and make statistics more accessible, two Colgate researchers formed the Data Science Collaboratory. 

It started as a pie-in-the-sky idea, says Will Cipolli, assistant professor of mathematics and the only statistician on campus. What if the other lone statisticians at small colleges and universities across New York state teamed up? They might not be able to field a department softball team, but at least they could build a community. 

And, just as more data points create greater statistical power, the combined strengths of many quantitative researchers enable their peers to ask new and better research questions. That’s what happened when Cipolli’s idea became the Data Science Collaboratory at Colgate University.

The notion arose because Cipolli is frequently called on to collaborate with other faculty members on their quantitative research projects. By working in different fields, “I get to learn a lot, which I really love,” Cipolli says.      

He had many conversations about the need for data analysis support with Joshua Finnell, head of research and instruction in the Colgate University Libraries. A data librarian, Finnell also helps faculty members discover, curate, analyze, preserve, and visualize data for their research projects. As he and Cipolli discussed the need for a more comprehensive approach to data literacy, the idea for a data analysis and collaboration network took shape.          

They applied for, and received, a New Initiatives Grant from the Central New York Library Resources Council in 2018. With that funding, their project was off and running.

Finnell and Cipolli started building a website for the project and recruiting other data scientists from inside and outside Colgate. They call it the Data Science Collaboratory at Colgate University. (The term collaboratory was defined by computer scientist William Wulf as “a center without walls, in which researchers can perform research without regard to physical location … sharing data and computational resources and accessing information in digital libraries.”)

They recruited 10 faculty members from colleges and universities in the New York Six Liberal Arts Consortium. “All of our collaborators bring something unique,” Finnell says. By being part of the network, these researchers are available to consult and work with faculty in need of advanced data analysis or statistical modeling. “While I can help with many statistical techniques, even those beyond my expertise, we can connect researchers with experts in the methods they seek to use,” Cipolli says.

That’s already led him to some interesting new collaborations. For example, he recently worked with sociologists at King’s University College in Canada to analyze how the media has portrayed COVID-19 restrictions, using natural language processing of almost 700 articles. He and other Data Science Collaboratory members have published papers on racial bias, Arctic ecosystems, and boulders on Mars. 

Several student collaborators have also been instrumental to the growth of the Collaboratory, both as affiliates and through summer research opportunities.

Current student members are developing statistical applications for the Collaboratory this year.

They’re building point-and-click applications that will live on the website and let users conduct their own statistical analyses. Apps for various statistical methodologies are already up on the website. “The applications make selecting, performing, and interpreting analyses more accessible,” Cipolli says.

Chau Pham ’22 was one of the first students to work on the project. She says that even though the job was only supposed to be a few hours a week, she couldn’t tear herself away. “I have never felt that passionate about any project before,” Pham says. Chris Cherniakov ’24 is developing the apps further, using the template that Pham created. He echoes Pham’s passion for the project. “It’s honestly awesome that we can bring people from different backgrounds to work together and help each other,” Cherniakov says.

The Collaboratory has also become a member of the Academic Data Science Alliance, a community network of academic data science leaders, practitioners, and educators; and of the Institute for the Quantitative Study of Inclusion, Diversity, and Equity, a research-into-action network applying mathematics, data science, and computation to social justice.

“The Collaboratory aims to be a place where data science is accessible to everyone,” Cipolli says, “and an integral part of developing the next generation of data scientists at Colgate University.”  

One former Collaboratory member, Jake Scott ’20, went on to work at the Federal Reserve Board; another, Caio Brighenti ’20, is an analyst for the Detroit Lions; and Liam Emmart ’19 is now at the artificial intelligence company BlackBoiler.