Frequently Asked Questions

🐠 What is this collaboration?

A year-long research workshop on large language models:

  • conducted online from May ’21 to May ’22
  • with a few live sessions/events spread over the year, including at least an opening and a closing event
  • with collaborative tasks aimed at creating, sharing and evaluating a very large multilingual dataset and a very large language model as tools for research

Short name: BigScience

Long name: in the organization working group we are referring to it as the "Summer of Language Models 21 🌸"

🐣 What will be the outcomes?

  • Artifacts created during the collaborative tasks (dataset, model, code tools)
  • Publications in official proceedings (e.g. ACL, PMLR…) along the way
  • Fostering discussion on the research questions related to language models (capabilities, limitations, potential improvements, bias, ethics and social aspects, environmental impact, relation with cognitive neuroscience research) as well as on more collaborative ways to do large-scale research in AI/NLP

🏘 How is this organized?

Generally like a workshop, i.e. with

  • A steering committee (SC) giving scientific and general advice.
  • An organization committee (OC) designing the collaborative tasks as well as organizing the workshop events and public participation. Given the diversity of tasks, the OC is split into Working Groups (WG).
  • Possibly, the participation of workshop attendees (the public), in particular in guided aspects of the collaborative tasks and by attending live events

🗺 How can I participate?

You can join as (see details of the roles in the main document BigScience - Organization):

  • Advisor (Steering Committee member):
    • role: give general scientific/organizational advice
    • time commitment: light - reading a newsletter every 2 weeks - giving feedback/advice
  • Participant in a Working Group (Organizing Committee member)
    • role: joining one of the working groups of the OC (see list below): advising/designing/building the collaborative task (building the dataset/model/tools) or advising/designing/organizing the live events
    • time commitment: medium - depends on the chosen task (see details of the working groups below)
  • Chair/co-chair of a Working Group (Organizing Committee member)
    • role: the chair(s) are expected to provide at least the minimal amount of work necessary for having a very bare-bones version of the task. If other members are active in the WG, the chair(s) can mostly coordinate the effort and organize the decision process.
    • time commitment: more significant - also depends on the chosen WG
  • Possibly as a Workshop attendee joining live events or some public aspects of the collaborative tasks (to be defined by the working groups)
    • role: participating in the collaborative task in a guided way, following guidelines set up by the OC (helping build the dataset, helping build the tools)
    • time commitment: free - up to the attendee

🗺 Who can participate?

BigScience is a research workshop and is open to researchers, i.e. people affiliated with a research organization (in academia or industry) whose day job involves, at least in part, work such as publishing papers in peer-reviewed venues. It is also open to people whose technical and professional expertise is relevant to the social aspects of the project.

The organization of the research workshop is not a good place to learn about AI or find mentors, as we do not have an internal mentoring program. It’s also not, per se, a community for general AI or AGI discussions.

If you are interested in learning about AI/ML or already working on Machine Learning, AI and NLP side projects, note that we plan to organize events for the AI/ML enthusiast community and the general public. You can fill out the following Google Form and join the mailing list to be informed about these upcoming public events.

