Thursday, May 15, 2025 From rOpenSci (https://ropensci.org/blog/2025/05/15/puentes-comunidades-campeones-ropensci/). Except where otherwise noted, content on this site is licensed under the CC-BY license.
Being part of the rOpenSci Champions Program has been an experience of professional growth and an opportunity to contribute to the rOpenSci community. I learned about R package development while working on a tool to facilitate access to census data from Argentina.
In this blog post, I want to share how this experience opened new opportunities, connected me with people and communities, and led me to be part of new projects, strengthening my commitment to open access to data.
From the beginning, my goal was to develop an R package that would allow structured access to Argentina’s census data. The idea arose from the need to have historical information organized and ready to be used in statistical analysis and research projects. My work as a population statistics analyst showed me the importance of having a tool that would facilitate the processing of these data. For historical censuses, the data are scattered across different formats (books, PDFs, spreadsheets and REDATAM), which makes them difficult to access and use.
Collage with the covers of the national censuses of Argentina from 1970 to 2022.
During the program, I worked on the organization and standardization of the data, facing challenges such as the structuring of the information (tidy data) and the creation of efficient functions for its manipulation. One of the most enriching aspects of the program was the training sessions provided by rOpenSci (special thanks to Maëlle and Yani for all the patience and learning), together with the continuous discussion of projects with the other Champions in the program.
We participated in virtual trainings on (among other topics) code management, packages that make R package development more efficient, and automation with GitHub Actions, all with a focus on best practices. In addition, I benefited from the personalized mentoring of Luis Verde, a friend of the LatinR community, who accompanied me throughout the package development process, providing key guidance at each stage.
Illustration by Allison Horst
As I progressed in the development of the package, I faced a key challenge: the structure of the data. As I incorporated information from different census years, I realized that each census brought with it thousands of Excel files in different formats. This made automation difficult and meant transforming the files one by one, which made the standardization task even more complex. It was then that I decided to invite Emanuel Ciardullo to join the project. We formed a duo with complementary points of view (me from sociology, him from statistics) to face this first phase of the package. This alliance was key to rethinking the approach and sharing the technical and conceptual work.
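To give a rough idea of the kind of reshaping this standardization involves, here is a minimal sketch in base R. It is not the actual arcenso code: the table layout, the column names, the `to_tidy()` helper and all the figures are hypothetical, made up only for illustration. It takes a wide table, as it might appear in a census spreadsheet, and turns it into a tidy one with one row per province and age group.

```r
# Hypothetical wide table: one row per province, one column per age group.
# All values are invented for illustration; they are not census figures.
wide <- data.frame(
  province    = c("Buenos Aires", "Chaco"),
  age_0_14    = c(100, 40),
  age_15_64   = c(250, 90),
  age_65_plus = c(50, 20),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reshape the wide table into tidy (long) form,
# with one row per province and age group, tagged with the census year.
to_tidy <- function(df, census_year) {
  value_cols <- setdiff(names(df), "province")
  long <- data.frame(
    year       = census_year,
    province   = rep(df$province, times = length(value_cols)),
    age_group  = rep(value_cols, each = nrow(df)),
    population = unlist(df[value_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  # Sort for readability and reset the row names.
  long <- long[order(long$province, long$age_group), ]
  rownames(long) <- NULL
  long
}

tidy_1970 <- to_tidy(wide, census_year = 1970)
print(tidy_1970)
```

Once every table shares this shape, tables from different census years can be stacked with `rbind()` and filtered or summarized with the same code, which is what makes the tidy layout attractive for a package like this one.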
My intention was to cover all the census years in the package, but over time I realized that it was going to take more time than I had estimated. It was not only a matter of organizing data, but also of designing a structure that would allow the integration of information in a scalable and reusable way in the future.
With a volume of work already done and a clearer strategy, we reordered the project outline, defined a roadmap for the different phases of data availability, and then were able to focus on finalizing the data we had already structured. Finally, we put together the documentation and developed the first functions: get_census(), check_repository() and arcenso().
Hex from the arcenso package, created as part of the rOpenSci Champions Program.
In the midst of these reflections and technical challenges, ARcenso was born. This project, with its package arcenso, seeks not only to facilitate access to census data, but also to promote its use among researchers, the public sector and citizens in general by means of free software tools. The possibility of contributing a useful, open and community-oriented tool has been one of the most important motivations of this process.
The project is in its first stage: you can already install the package using remotes and explore the first census years available, 1970 and 1980. The aim is to continue development so that ARcenso becomes more robust, undergoes rOpenSci peer review, and eventually becomes available on CRAN. The Champions Program was the starting point, but development of the package continues because the potential to facilitate access to key data in an open and reusable way is enormous.
I had the opportunity to receive a scholarship to attend posit::conf, one of the most important international conferences in the R ecosystem. It was a transformative experience: I learned a lot, met people I admire and experienced firsthand what it means to be part of a global community that is committed to free software, open access and collaborative development. In addition, it was very special to meet in person part of the rOpenSci team and other people who are part of this community. The exchange helped me rethink key aspects of ARcenso, from its structure to its potential to attract open collaboration. I came away with new ideas, inspiration and a network of people to continue growing with.
Key moments of the journey: meeting the rOpenSci community at posit::conf and presenting arcenso at LatinR.
In November 2024, we presented ARcenso at LatinR, the Latin American conference on the use of R in research and development. Together with Emanuel, we shared the work done during the program and how we worked together to build this first phase of the project. It was a very special moment to show the regional community what we had achieved and to receive their feedback and support during the panel on the Package Development Process.
And to close this great 2024, we gave a local presentation of the package, organized by ‘R in Buenos Aires’ together with R-Ladies Buenos Aires. As part of the R in Buenos Aires organizing team, I coordinated this activity with the aim of sharing the project with the community, showing what we were building and inviting more people to get to know it. The event went very well: not only did I receive valuable feedback from the community, but I was also able to connect with other people who had faced similar problems. We also took the opportunity to spread the word about the rOpenSci Champions Program, in the hope that more people in our region will be encouraged to apply for the next cohort.
Presentation of the arcenso package at the local chapters of R-Ladies and R in Buenos Aires: community, functions and a behind-the-scenes look at working as a duo.
Presenting the project and discussing it with the community allowed me to reaffirm the importance of creating accessible and well-documented tools. It was also a reminder that we were not alone in this process: the R community is a space where knowledge is built collectively, and actively participating in it was fundamental to moving forward with the package.
Participating in the rOpenSci Champions Program was the starting point for creating something I didn’t know I could build. It encouraged me to move from using R to developing an R package, and from the chaos of data to designing a tool meant to help other people work better. But, above all, it connected me with a community that believes in sharing what it knows, in accompanying each other in the process and in opening paths for those who come after. Today ARcenso continues to grow, and so do I: with new ideas, new challenges and the desire to continue building in community and for the community.