     Modern Geomatics Technologies and Applications
               management, and so on. In this regard, COVID-19 tweet data have also been examined in recent research.
               Kabir and Madria analyzed COVID-19 tweets to see how themes, subjectivity, and human emotions changed
               over time. They also made the CoronaVis Twitter dataset (focused on the United States) accessible to the
               research community at https://github.com/mykabir/COVID19, and developed an interactive web application
               for monitoring real-time tweets on COVID-19 and dynamically generating insights. They used sentiment
               analysis and correlated it with trending topics to determine the cause of a sentiment and gain a deeper
               understanding of human emotions [15]. Lamsal introduced the COV19Tweets Dataset [16], a large-scale
               Twitter dataset of over 310 million COVID-19-specific English-language tweets with sentiment scores. The
               GeoCOV19Tweets Dataset [17] is presented as a geotagged version of the dataset. Lamsal also described the
               datasets' design in detail, as well as the tweets in both datasets. The datasets have been made available
               in the hope of improving understanding of the spatial and temporal aspects of public discourse around the
               pandemic [18]. By analyzing publicly available geolocated Twitter data, Bisanzio et al. were able to
               predict the spatiotemporal distribution of confirmed COVID-19 cases at the global level within the first
               few weeks of the outbreak. Their findings show that geolocated Twitter data can be used to characterize
               human mobility and the spread of novel disease agents such as SARS-CoV-2. Furthermore, once an initial
               introduction has occurred, such a method may be used to predict spread within countries. Twitter data may
               be combined with other data capturing human activity (such as flight traffic, cell phone, and census data)
               to create global and local warning systems that shorten public health response times [19].
               In this study, the relationship between COVID-19 data (numbers of deaths, cases, recoveries, and tests)
               and Twitter data (users' posts, geographical location, and shared profile photo) was investigated using
               geographic weighted regression (GWR). The aim was to examine the relationship between geolocated tweets
               and COVID-19 data.
               Geographic Weighted Regression (GWR)
               Linear regression estimates parameters that link the explanatory variables to the response variable.
               When the technique is applied to spatial data, however, questions arise about whether these parameters
               are stationary over space.
               To identify the nature of relationships between variables, linear regression models the dependent
               variable y as a linear function of explanatory variables x1,..., xp. For n observations indexed by
               i = 1,..., n, the model is written:
               $$ y_i = \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i \qquad (1) $$
               where β0, β1,..., βp are the parameters and ε1, ε2,..., εn are the error terms. In this model, the
               coefficients βk are assumed identical across the study area. However, the hypothesis that the effect of
               the explanatory variables on the dependent variable is spatially uniform is often unrealistic [20]. If
               the parameters vary significantly in space, a global estimator will hide the geographical richness of the
               phenomenon. GWR is a model with spatially varying coefficients: the regression coefficients are not
               constant but vary with the geographical coordinates of the observations. In other words, the coefficients
               of the explanatory variables form continuous surfaces that are evaluated at specific points in space
               [20, 21]:
               $$ y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \qquad (2) $$
               where (ui, vi) are the geographical coordinates of observation i.
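To illustrate why a global fit of Equation (1) can hide spatial variation, the following minimal sketch fits ordinary least squares with NumPy to synthetic data whose true slope drifts from west to east. The synthetic data, the coordinate `u`, and the helper name `fit_global_ols` are assumptions made for this illustration; none of them come from the study.

```python
import numpy as np

def fit_global_ols(X, y):
    """Least-squares estimate of [beta_0, beta_1, ..., beta_p] in Equation (1)."""
    design = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Synthetic example (assumed): the slope on x rises from 1 in the west to 3 in the east.
rng = np.random.default_rng(0)
n = 500
u = rng.uniform(0.0, 1.0, n)           # hypothetical east-west coordinate
x = rng.normal(size=n)                 # single explanatory variable
true_slope = 1.0 + 2.0 * u
y = 0.5 + true_slope * x + rng.normal(scale=0.1, size=n)

X = x.reshape(-1, 1)
beta_global = fit_global_ols(X, y)                 # one slope for the whole area
beta_west = fit_global_ols(X[u < 0.5], y[u < 0.5])
beta_east = fit_global_ols(X[u >= 0.5], y[u >= 0.5])

print("global slope:", beta_global[1])
print("west slope:  ", beta_west[1])
print("east slope:  ", beta_east[1])
```

The single global slope falls between the western and eastern estimates, so the spatial trend in the relationship is averaged away. Splitting the area in two is only a crude stand-in for the continuous coefficient surfaces of Equation (2).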
               The model is estimated under the following hypothesis: the closer two observations are geographically,
               the more similar the effect of the explanatory variables on the dependent variable, i.e. the closer the
               coefficients of the regression's explanatory variables. As a result, to estimate the model with variable
               coefficients at point i, the fixed-coefficients model is used, with only observations close to i included
               in the regression. However, the greater the number of points in the sample, the smaller the variance but
               the greater the bias. The solution is to reduce the influence of the most distant observations by
               assigning each observation a weight that decreases with its distance from the point of interest. Output
               fields of this analysis include StdResid (standardized residual values), LocalR2 (weighted r² between
               observed and predicted values), and Predicted (estimated local values) [22].
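The local estimation described above can be sketched as a weighted least-squares fit repeated at every observation point. The code below is a minimal NumPy illustration under stated assumptions, not the tool used in the study: the Gaussian distance-decay kernel, the fixed `bandwidth` parameter, the function name `gwr_fit`, and the synthetic data are all choices made for this example.

```python
import numpy as np

def gwr_fit(coords, X, y, bandwidth):
    """Fit Equation (2) at each observation point by weighted least squares.

    Returns (betas, predicted), where betas[i] holds [beta_0, ..., beta_p]
    evaluated at the coordinates of observation i.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])       # intercept + explanatory variables
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        # Assumed Gaussian kernel: weights shrink as distance from point i grows.
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)
        xtw = design.T * w                          # X^T W_i without building W_i
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    predicted = np.sum(design * betas, axis=1)      # local fitted value at each point
    return betas, predicted

# Synthetic example (assumed): the slope on x increases with the first coordinate.
rng = np.random.default_rng(0)
n = 400
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=n)
true_slope = 1.0 + 2.0 * coords[:, 0]
y = 0.5 + true_slope * x + rng.normal(scale=0.1, size=n)

betas, predicted = gwr_fit(coords, x.reshape(-1, 1), y, bandwidth=0.15)
print("mean absolute slope error:", np.mean(np.abs(betas[:, 1] - true_slope)))
```

The bandwidth controls the trade-off noted in the text: a wide kernel uses more points, which lowers variance but biases the local coefficients toward the global average, while a narrow kernel does the opposite. Production tools typically select the bandwidth by cross-validation or an information criterion rather than fixing it by hand.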





