Page 12 - ChatGPT Prompts Book: Precision Prompts, Role Prompting, Training & AI Writing Techniques for Mortals

dataset contains a lot of errors or inconsistencies, the model
                may learn to produce outputs that are similarly flawed.
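
                The point above can be illustrated with a toy sketch. It is
                not how ChatGPT is trained; it is a tiny word-pair counter,
                and the corpus and function names are hypothetical. It only
                shows how a model that learns from flawed text can repeat
                the flaw.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def most_likely_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]


# Hypothetical toy corpus: the misspelling "recieve" outnumbers "receive".
corpus = [
    "you will recieve a reply",
    "you will recieve a refund",
    "you will receive a letter",
]

model = train_bigram_model(corpus)
print(most_likely_next(model, "will"))  # prints "recieve"
```

                Because the misspelling appears more often than the correct
                spelling in the training text, the toy model prefers it.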

                Understanding the sources of the dataset used to train a
                language model is therefore important for assessing the
                model's biases and reliability. In the case of ChatGPT, the
                model was trained on a wide range of data, including the
                following source types:

                        1) Websites: Content from millions of websites,
                            including news articles and blog posts covering a
                            wide array of topics, such as science, technology,
                            politics, history, and culture.

                        2) Books: Excerpts from books, both fiction and
                            non-fiction, exposing the model to a variety of
                            writing styles, genres, and narrative structures.

                        3) Online forums: Content collected from online
                            forums and discussion boards, such as Reddit and
                            Stack Overflow, providing ChatGPT with examples of
                            informal language and conversation, as well as a
                            variety of opinions and viewpoints.

                        4) Social media: Text from social media platforms,
                            including Twitter and Facebook, was used to help
                            ChatGPT understand shorter and more casual forms
                            of text, including slang and abbreviations.

                        5) Conversational data: Dialogue from customer
                            support logs, public chat rooms, and other
                            sources, used to improve ChatGPT’s ability to
                            engage in dialogue and understand context in a
                            conversational setting.

                ChatGPT-4

                At the time of writing, ChatGPT is powered by the GPT-4
                model. This model is the latest version in a series of GPT
                models and the culmination of decades of research and
                innovation in language modeling.