Manage your expectations: you’re in for a crazy unstructured ride!
No warranty: use at your own risk!
Some issues are controversial and you might strongly disagree with my opinions
Feel free to (strongly) disagree
Please refrain from insulting my mother
Participants’ rights
Scientific Fraud
Questionable Research Practices
Epistemological values
Questionable Recommended Research Practices
(Some) Epistemological values
What does it mean to you?
To me, in this context, it means:
Is a matter for another day…
I love philosophy of science but I’ll spare you that lecture…
For today it suffices to say that Science is now done in teams
Increase transparency
Share materials/resources
Preregistration
Data Sharing
Supplementary Materials
Preprints
Web hosting
Version control
SciOps? (git, containers, RMarkdown, pandoc)
Reason/Logic
Valuing knowledge
Independence
Tolerance (see Santos, Hagá, & Garcia-Marques, in prep)
Fraud
Replication Crisis
Perverse Incentive Structures
Lack of Diversity
Lack of good organizational policies and work ethic
Undocumented knowledge and procedures (makes onboarding hard)
Imposter syndrome
Loneliness
These issues can compromise the error-correction of scientific communities (see Mayo, 1996)
But we must also be careful not to compromise it with our intervention (Garcia-Marques)
Prof. Marcelo Camerlo's suggestion of creating an R user group stuck with me…
When participating in a WriteOn group, coordinated by Sara Hagá at FP-UL, I thought the model could be adapted to R
After giving workshops on R we started grouping together on RUGGED
We’re on our second season now (see the episode guide)!
If you want to join us follow these simple steps.
You don’t have to speak Portuguese, nor be affiliated with any Portuguese institution
But we do meet on Lisbon/GMT0 working hours…
Scientific results are only published in (expensive) paid journals
Descriptions of methods and analysis are short
No (or very few) supplementary materials
Descriptions of methods and analysis are (mostly) enough to replicate
Open channels for communication
Papers are indexed and easier to search for
Supplementary materials:
Open Access
Preprints
Modern tooling: osf.io, R/python, etc…
Data sharing must be done with respect to participants’ rights:
Consent forms should mention anonymised data might be shared
Data must be anonymised before being shared (err on the safe side)
Triple check the data is anonymised before putting it online
You can make the data anonymisation script public
If you’re sharing your files, try and make sure the names for the columns, variables, files, etc…, are meaningful to others
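A minimal sketch of what a public anonymisation script could look like. Everything in it is an assumption made for illustration: the data, the column names, and the `anonymise()` helper are hypothetical. Dropping direct identifiers is only a first step; columns such as age can still identify someone in a small sample, so check them too.

```r
# Hypothetical raw data: all names and values are made up for illustration
raw <- data.frame(
  name  = c("Ana", "Bruno", "Carla"),
  email = c("ana@example.org", "bruno@example.org", "carla@example.org"),
  age   = c(23, 35, 41),
  q1    = c(4, 2, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drops identifier columns and adds an arbitrary code
anonymise <- function(data, id_columns) {
  # Shuffle rows first so the codes do not reveal the collection order
  shuffled <- data[sample(nrow(data)), , drop = FALSE]
  kept <- shuffled[, !(names(shuffled) %in% id_columns), drop = FALSE]
  kept <- data.frame(
    participant_id = sprintf("P%03d", seq_len(nrow(kept))),
    kept,
    stringsAsFactors = FALSE
  )
  rownames(kept) <- NULL
  kept
}

shared <- anonymise(raw, id_columns = c("name", "email"))

# Give columns names that are meaningful to others
names(shared)[names(shared) == "q1"] <- "life_satisfaction_rating"

# Fail loudly if an identifier column survived
stopifnot(!any(c("name", "email") %in% names(shared)))
```

Only `shared` would be written to disk and uploaded; the raw file stays private.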
If you’re going to share your analysis code make it portable:
# This is not portable!
setwd("/home/your_user_name/folders/that/exist/only/on/your/machine/")
# It could be made portable by using a predefined project structure
setwd("./project/stats/analyzes/")
# Importing data through an IDE's interface (e.g., RStudio) is not portable
# If the dataset variable appears without being defined first that's not portable
dataset$Age <- as.numeric(dataset$Age)
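One possible way to make the snippet above portable, sketched under assumptions: the project layout (`data/dataset.csv` inside a project root) is hypothetical, and a temporary folder stands in for the real project so the example runs anywhere. The point is that paths are relative to the project root and that `dataset` is defined by the script itself.

```r
# Stand-in for your real project folder (assumption for this sketch)
project_root <- file.path(tempdir(), "project")
dir.create(file.path(project_root, "data"), recursive = TRUE, showWarnings = FALSE)

# Made-up data so the example is self-contained
write.csv(
  data.frame(Age = c(23, 35), IV = c(1, 2), DV = c(2, 4)),
  file.path(project_root, "data", "dataset.csv"),
  row.names = FALSE
)

# In practice you would simply start R in the project root
old_wd <- setwd(project_root)

# Relative path built with file.path(), which works on any operating system
dataset <- read.csv(file.path("data", "dataset.csv"))
dataset$Age <- as.numeric(dataset$Age)

# Restore the previous working directory
setwd(old_wd)
```

Anyone who copies the project folder can then run the script without editing a single path.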
If you’re going to share your code try and clean it up:
# Avoid tons of package imports when you only use one
library(afex)
library(dplyr)
library(effectsize)
library(tidyr)
library(WRS2)
# If I only use `dplyr` in that script then I should only have:
library(dplyr)
# If you don't know which packages you use and for what
# you should try and find that out
regression <- lm(DV ~ IV, dataset)
# Same goes for print statements of needless information
print(class(dataset$IV))
print(regression)
# You probably only want this last print statement
print(summary(regression))
# Better yet, save that to a file rather than printing it to stdout
sink("../results/regression_summary.txt")
print(summary(regression))
sink()
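An alternative sketch for saving results, assuming made-up data and a temporary folder standing in for `../results`. `sink()` keeps redirecting output if the script stops before the closing `sink()` call; `capture.output()` combined with `writeLines()` avoids that, and creating the folder first means the script does not depend on it already existing.

```r
# Hypothetical data, made up for illustration
dataset <- data.frame(IV = c(1, 2, 3, 4, 5), DV = c(2.1, 3.9, 6.2, 7.8, 10.1))
regression <- lm(DV ~ IV, data = dataset)

# Stand-in for "../results" (assumption for this sketch)
results_dir <- file.path(tempdir(), "results")
dir.create(results_dir, recursive = TRUE, showWarnings = FALSE)

# Write the printed summary to a text file
summary_file <- file.path(results_dir, "regression_summary.txt")
writeLines(capture.output(summary(regression)), summary_file)

# Also record the R and package versions that produced the results
session_file <- file.path(results_dir, "session_info.txt")
writeLines(capture.output(sessionInfo()), session_file)
```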
This might be crazy
No warranty: use at your own risk
I had little time to research and cite similar projects:
Revision control with git and modern code forges (e.g., gitlab.com, github.com)
Continuous Integration/Continuous Delivery or Deployment:
DevOps: Keeping track of dependencies and configurations with containers
TODO lists, notes, discussions are tracked per project
Changes and decisions are linked to discussions, TODOs and notes.
Collaboration can (but need not) be done in the open
Files are licensed under Open Source licenses
A new level of:
Transparency
Archival
Collaboration
Automation
Level Up Your Open Science Game
Quick (can be done now)
Requiring planning and commitment (can be planned now)
Create an account on osf.io
Create an account on gitlab.com
Contribute to a RUGGED project’s “help wanted” issues
Fork SciOps and change a file to see what happens
Start drafting a mock preregistration
Check your scripts for non-portable code
Clean up your scripts
Start drafting a longer “Procedure” or “Analysis” for your paper
Start taking notes of your decisions so far
Email your colleagues to start a community
Write down ideas
Make a TODO list
Sort TODOs:
Pick your next task
Choose a commitment device
What do I need to do to:
Make all my materials public?
Use programming for data analysis?
Use Open Source software for my research (analysis and experiment running)?
All feedback is welcome!
You can comment on this issue
You can email me any feedback you have
We can stay and chat after the session