Effects of grazing on Swedish meadows

Author

Bastien, Elia, Caroline, Alexia, Marine, Lily, Julie

INTRODUCTION

Terrestrial ecosystems were dominated for several million years by large herbivorous mammals, which then declined at the end of the Pleistocene under the pressure of the first prehistoric human populations. The loss of these large herbivore species (aurochs, wild horses and anes, woolly rhinoceros…) was also synonymous with the loss of the ecosystem services they provided. (lundgren2020human?)

For several years, the “rewilding” movement has been aiming to restore these ecosystem services by reintroducing species into large European natural areas, and the present study seeks to assess the extent to which introduced herbivore species have an influence on vegetation. (garido2019experimental?)

MATERIALS AND METHODS

The study site is located approximately 17 km south of Uppsala, Sweden. On this study site, three plots were used. Each plot was divided in two, with an ungrazed part and a grazed part. Four young Gotland stallions were placed in each grazing area for three years.

Numerous variables were studied: - plant height at maturity - seed mass - species richness - plot area occupied by anemogamous plants - plot area occupied by entomogamous plants - RCS score *. Other information was noted at each survey, such as the season, year and observer who carried out the inventory. Information on the insects found on the plot was also recorded, but will not be used in our analysis.

The RCS score corresponds to an index of the strategy used by the plants. (Wood2018microbiomes?)

RESEARCH QUESTIONS

We have chosen to use the (garido2019experimental?) dataset to answer three different questions. Firstly, does grazing have an effect on the plant species richness of plots? Secondly, does grazing have an effect on plant morphology? And finally, does grazing have an effect on plant pollination strategies?

To answer these questions, we transformed the initial dataset to obtain a dataset of 404 records. The variables used are : - treatment (grazed or ungrazed) - season - year - cover.area.bee (plot area occupied by entomogamous plants) - cover.area.gene (plot area occupied by anemogamous plants) - logh (log of plant height) - logS (log of seed mass) - richness (species richness) - observer - R.score - S.score - C.score - ratio poll (cover.area.bee/cover.area.gene)

PACKAGES IMPORT

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(FactoMineR)
library(dplyr)
library(ggplot2)

DATA IMPORT

df <- read.table('pollen.csv', sep = ';',stringsAsFactors=TRUE, header = TRUE)
df$year <- as.factor(df$year)
summary (df)

    treatment      season      year     cover.area.bee   cover.are.gene  
 grazed  :207   autumn:152   2014:101   Min.   : 0.000   Min.   : 0.250  
 ungrazed:197   spring: 99   2015:151   1st Qu.: 0.500   1st Qu.: 1.750  
                summer:153   2016:152   Median : 2.000   Median : 3.500  
                                        Mean   : 3.115   Mean   : 4.696  
                                        3rd Qu.: 5.000   3rd Qu.: 6.000  
                                        Max.   :22.000   Max.   :23.750  
 logHeight.score   logSeedMass.score  num.species     observer
 Min.   :-5.8146   Min.   :-4.0858   Min.   : 2.000   A:101   
 1st Qu.:-1.8925   1st Qu.:-0.9571   1st Qu.: 6.000   B:303   
 Median :-1.4330   Median :-0.2923   Median : 8.000           
 Mean   :-1.1248   Mean   :-0.3030   Mean   : 7.993           
 3rd Qu.:-0.9145   3rd Qu.: 0.4340   3rd Qu.:10.000           
 Max.   : 3.8725   Max.   : 1.8636   Max.   :15.000           
    R.score           C.score            S.score          RatioPoll      
 Min.   :0.00000   Min.   :0.006173   Min.   :0.00000   Min.   : 0.0000  
 1st Qu.:0.08291   1st Qu.:0.160511   1st Qu.:0.06604   1st Qu.: 0.1409  
 Median :0.14196   Median :0.245438   Median :0.10698   Median : 0.5420  
 Mean   :0.14566   Mean   :0.268780   Mean   :0.11690   Mean   : 1.3659  
 3rd Qu.:0.19711   3rd Qu.:0.328159   3rd Qu.:0.15181   3rd Qu.: 1.3988  
 Max.   :0.35867   Max.   :0.965243   Max.   :0.36787   Max.   :32.0000

PART 1: EFFECT OF GRAZING ON SPECIES RICHNESS

1.Data visualization

ggplot(data=df)+
  geom_boxplot(aes(x=treatment, y=num.species))+
  geom_jitter(aes(x=treatment, y=num.species)) + 
  labs(x="Treatment", y="Species richness")

Fig 1 : Effect of grassland on species richness

The specific richness of grazed areas appears to be greater than in ungrazed areas @ref(fig:fig1).

hyp1<-lm(num.species~1,data=df)
hyp0<-lm(num.species~treatment,data=df)
anova(hyp0,hyp1)

Analysis of Variance Table

Model 1: num.species ~ treatment
Model 2: num.species ~ 1
  Res.Df    RSS Df Sum of Sq     F    Pr(>F)    
1    402 2280.1                                 
2    403 2417.0 -1   -136.87 24.13 1.312e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value is less than 0.05, so grazing has a significant effect on specific richness.

2.Choice of model: in this section, we assume that grazing (treatment) depends on the season and year of observation

modComplet<-LinearModel(num.species~(treatment+season+year)*(treatment),data=df,selection = "aic")
modComplet

Results for the complete model:
==============================
Call:
LinearModel(formula = num.species ~ (treatment + season + year) * 
    (treatment), data = df, selection = "aic")

Residual standard error: 2.201 on 394 degrees of freedom
Multiple R-squared:  0.2103 
F-statistic: 11.66 on 9 and 394 DF,  p-value: 2.732e-16 
AIC = 647.3    BIC = 687.3

Results for the model selected by AIC criterion:
===============================================
Call:
LinearModel(formula = num.species ~ treatment + season + year + 
    treatment:year, data = df, selection = "aic")

Residual standard error: 2.196 on 396 degrees of freedom
Multiple R-squared:   0.21 
F-statistic: 15.03 on 7 and 396 DF,  p-value: 1.95e-17 
AIC = 643.5    BIC = 675.5

Ftest
                    SS  df      MS F value    Pr(>F)
treatment       107.32   1 107.315  22.255 3.315e-06
season          194.60   2  97.300  20.178 4.518e-09
year            133.15   2  66.575  13.806 1.599e-06
treatment:year   41.47   2  20.734   4.300   0.01421
Residuals      1909.51 396   4.822                  

Ttest
                                    Estimate Std. Error t value  Pr(>|t|)
(Intercept)                         8.109457   0.115553 70.1793 < 2.2e-16
treatment - grazed                  0.525042   0.111295  4.7176 3.315e-06
treatment - ungrazed               -0.525042   0.111295 -4.7176 3.315e-06
season - autumn                    -0.857605   0.154433 -5.5532 5.150e-08
season - spring                     0.127684   0.179319  0.7120 0.4768541
season - summer                     0.729922   0.154433  4.7265 3.180e-06
year - 2014                         0.924700   0.178175  5.1898 3.370e-07
year - 2015                        -0.358379   0.154465 -2.3201 0.0208409
year - 2016                        -0.566321   0.154469 -3.6662 0.0002797
treatment - grazed : year - 2014   -0.456895   0.168272 -2.7152 0.0069130
treatment - ungrazed : year - 2014  0.456895   0.168272  2.7152 0.0069130
treatment - grazed : year - 2015    0.089175   0.151774  0.5876 0.5571677
treatment - ungrazed : year - 2015 -0.089175   0.151774 -0.5876 0.5571677
treatment - grazed : year - 2016    0.367720   0.151555  2.4263 0.0156989
treatment - ungrazed : year - 2016 -0.367720   0.151555 -2.4263 0.0156989

The results for the full model are as follows: p-value: 2.554e-16, AIC = 649, BIC = 681. However, the model chosen according to the AIC criterion shows the following results: p-value: 1.41e-17, AIC = 645.2 and BIC = 669.2. As the AIC criterion is smaller for this model than for the full model, we choose to retain: num.species ~ treatment + season + year + treatment:year. There is therefore an interaction between the pasture and its year of observation.

3.Validation of the chosen model with the Bartlett test

modelAIC<-lm(num.species~treatment+season+year+treatment:year,data=df)

res<-residuals(modelAIC)
boxplot(res~treatment,data=df)

bartlett.test(res~treatment,data=df)


    Bartlett test of homogeneity of variances

data:  res by treatment
Bartlett's K-squared = 1.9684, df = 1, p-value = 0.1606

boxplot(res~season,data=df)

bartlett.test(res~season,data=df)


    Bartlett test of homogeneity of variances

data:  res by season
Bartlett's K-squared = 4.3645, df = 2, p-value = 0.1128

boxplot(res~year,data=df)

bartlett.test(res~year,data=df)


    Bartlett test of homogeneity of variances

data:  res by year
Bartlett's K-squared = 3.7679, df = 2, p-value = 0.152

boxplot(res~treatment:year,data=df)

df$interaction <- interaction(df$treatment, df$year)
bartlett.test(res ~ interaction, data = df)


    Bartlett test of homogeneity of variances

data:  res by interaction
Bartlett's K-squared = 6.3, df = 5, p-value = 0.2781

For the Bartlett tests, the p-values are all greater than 0.05, which means that homoscedasticity is respected for all our parameters, or that the variances are all equal for all our observations @ref(fig:fig2).

4. Residual normality test with Shapiro

hist(res,main="Histogramme residus",xlab="Residus")

shapiro.test(res)


    Shapiro-Wilk normality test

data:  res
W = 0.99555, p-value = 0.3061

Residual normality is respected, as our p-value is greater than 0.05: the data are compatible with a normal distribution. Our linear model is therefore valid, as residual normality is a key assumption for applying this model @ref(fig:fig3).

Conclusion: Species richness therefore depends on grazing, season and year, as well as on the interaction grazing:year.

PART 2 - EFFECT OF GRAZING ON THE MORPHOLOGY

In this part, we’ll try to analyse the effect of treatment (grazing), a qualitative variable, on the morphological parameters of the plants (height and seed mass), quantitative variables.

1. Effect of grazing on the plant height

Potential relationship between Height and effect of treatment :

Let’s start by representing plant size as a function of the applied treatment (grazing or no grazing) to have an idea of the possible effect.

boxplot(df$logHeight.score~df$treatment, varwidth = TRUE, ylab = "plant height (log)", xlab = "Treatment", col='darkgrey', main = "", title = "effect of grazing on the height")

ggplot(data=df)+
  geom_boxplot(aes(x=treatment, y=logHeight.score))+
  geom_jitter(aes(x=treatment, y=logHeight.score))+
  ylab("Plant height (log)") +
  xlab("Treatment")

Interpretation :

The median for the no grazing treatment is slightly higher than for the grazing treatment @ref(fig:fig4). The data also seems slightly more dispersed and spread out. So we could say that the plants without grazing are potentially higher than the ones which had the treatment.

Statistical analysis of the relationship between Height and effect of treatment :

We saw earlier that there could be a potential effect of treatment on the plant’s height, we now want to know if this effect is really significant.

We perform an ANOVA to compare a random model with our model.

Significance test: Hypothesis H0: Grazing has no effect on plant height. Hypothesis H1: Grazing has an effect on plant height.

The p-value is 2.39e-4 < 0.05. The test is significant, so hypothesis H0 is rejected. Grazing significantly reduces plant size.

2. Effect of grazing on the seed mass

Now, let’s study the seed mass of plants in relation to grazing.

Potential relationship between seed mass and effect of treatment :

Let’s start by representing the seed mass of plants as a function of the applied treatment (grazing or no grazing) to have an idea of the possible effect.

boxplot(df$logSeedMass.score~df$treatment, varwidth = TRUE, ylab = "Seed mass (log)", xlab = "Treatment", col='darkgrey', main = "", title = "effect of grazing on seed mass")

ggplot(data=df)+
  geom_boxplot(aes(x=treatment, y=logSeedMass.score))+
  geom_jitter(aes(x=treatment, y=logSeedMass.score))+
  labs(y = "Seed mass (log)", x = "Treatment")

Interpretation :

The medians are almost identical @ref(fig:fig5). The data for the no grazing treatment also seems more dispersed and spread out. So we could say that there is potentially no effect of treatment on seed mass.

Statistical analysis of the relationship between Height and effect of treatment :

We saw earlier that there could be no potential effect of treatment on seed mass. We now want to know if there is no significant difference.

We perform an ANOVA to compare the model where we only consider the effect of grazing with a model where the effect of grazing is random.

Significance test: Hypothesis H0: Grazing has no effect on seed mass. Hypothesis H1: Grazing has an effect on seed mass.

hyp2<-lm(logSeedMass.score~1,data=df)
hyp3<-lm(logSeedMass.score~treatment,data=df)
anova(hyp2,hyp3)

Analysis of Variance Table

Model 1: logSeedMass.score ~ 1
Model 2: logSeedMass.score ~ treatment
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    403 457.54                           
2    402 456.58  1   0.96445 0.8492 0.3573

The p-value = 0.3573 > 0.05, so the seed mass under hypothesis H1 is not significantly different from that under hypothesis H0. We can conclude that there is no effect of grazing on seed mass.

3. Is the height of plants only affected by grazing ?

We will now analyze the effect of grazing on plant size more precisely. We want to know if the size is influenced by grazing (treatment) alone or by its interaction with other factors.

For this analysis, we will test the effects of grazing and its interaction with other factors on plant size through model analysis using the BIC criterion.

hyp4<-LinearModel(logHeight.score~(treatment+season+year+observer+logSeedMass.score+RatioPoll+R.score+C.score+S.score)*
                    (treatment+season+year+observer+logSeedMass.score+RatioPoll+R.score+C.score+S.score), data=df, selection = "bic")
hyp4

Results for the complete model:
==============================
Call:
LinearModel(formula = logHeight.score ~ (treatment + season + 
    year + observer + logSeedMass.score + RatioPoll + R.score + 
    C.score + S.score) * (treatment + season + year + observer + 
    logSeedMass.score + RatioPoll + R.score + C.score + S.score), 
    data = df, selection = "bic")

Residual standard error: 0.6471 on 351 degrees of freedom
Multiple R-squared:  0.8309 
F-statistic: 33.16 on 52 and 351 DF,  p-value: 1.753e-106 
AIC = -302.5    BIC = -90.45

Results for the model selected by BIC criterion:
===============================================
Call:
LinearModel(formula = logHeight.score ~ treatment + logSeedMass.score + 
    RatioPoll + R.score + C.score + S.score + treatment:logSeedMass.score + 
    treatment:RatioPoll + logSeedMass.score:R.score + logSeedMass.score:S.score + 
    RatioPoll:C.score + R.score:C.score + C.score:S.score, data = df, 
    selection = "bic")

Residual standard error: 0.6667 on 390 degrees of freedom
Multiple R-squared:  0.8005 
F-statistic: 120.4 on 13 and 390 DF,  p-value: 1.368e-127 
AIC = -313.8    BIC = -257.8

Ftest
                                 SS  df     MS  F value    Pr(>F)
treatment                     0.689   1  0.689   1.5498 0.2139129
logSeedMass.score             4.090   1  4.090   9.2027 0.0025783
RatioPoll                     2.833   1  2.833   6.3736 0.0119787
R.score                       0.024   1  0.024   0.0550 0.8147233
C.score                      81.942   1 81.942 184.3546 < 2.2e-16
S.score                       0.018   1  0.018   0.0397 0.8420779
treatment:logSeedMass.score   4.390   1  4.390   9.8773 0.0018013
treatment:RatioPoll           3.009   1  3.009   6.7695 0.0096260
logSeedMass.score:R.score     6.250   1  6.250  14.0607 0.0002038
logSeedMass.score:S.score    11.658   1 11.658  26.2274 4.779e-07
RatioPoll:C.score             3.110   1  3.110   6.9962 0.0084984
R.score:C.score               4.854   1  4.854  10.9205 0.0010390
C.score:S.score               4.599   1  4.599  10.3465 0.0014055
Residuals                   173.347 390  0.444                   

Ttest
                                           Estimate Std. Error  t value
(Intercept)                               -2.771317   0.181401 -15.2773
treatment - grazed                        -0.057289   0.046019  -1.2449
treatment - ungrazed                       0.057289   0.046019   1.2449
logSeedMass.score                          0.238069   0.078477   3.0336
RatioPoll                                 -0.073921   0.029280  -2.5246
R.score                                    0.338994   1.445631   0.2345
C.score                                    6.611139   0.486911  13.5777
S.score                                   -0.272817   1.368404  -0.1994
treatment - grazed : logSeedMass.score    -0.118967   0.037854  -3.1428
treatment - ungrazed : logSeedMass.score   0.118967   0.037854   3.1428
treatment - grazed : RatioPoll             0.042710   0.016415   2.6018
treatment - ungrazed : RatioPoll          -0.042710   0.016415  -2.6018
logSeedMass.score : R.score                2.521365   0.672407   3.7498
logSeedMass.score : S.score               -3.791413   0.740326  -5.1213
RatioPoll : C.score                        0.221980   0.083923   2.6450
R.score : C.score                        -13.670288   4.136717  -3.3046
C.score : S.score                         13.387882   4.162125   3.2166
                                          Pr(>|t|)
(Intercept)                              < 2.2e-16
treatment - grazed                       0.2139129
treatment - ungrazed                     0.2139129
logSeedMass.score                        0.0025783
RatioPoll                                0.0119787
R.score                                  0.8147233
C.score                                  < 2.2e-16
S.score                                  0.8420779
treatment - grazed : logSeedMass.score   0.0018013
treatment - ungrazed : logSeedMass.score 0.0018013
treatment - grazed : RatioPoll           0.0096260
treatment - ungrazed : RatioPoll         0.0096260
logSeedMass.score : R.score              0.0002038
logSeedMass.score : S.score              4.779e-07
RatioPoll : C.score                      0.0084984
R.score : C.score                        0.0010390
C.score : S.score                        0.0014055

After the analysis, the final model with the lower BIC is : logH ~ treatment + logS + RatioPoll + R.score + C.score + S.score + treatment:logS + treatment:RatioPoll + logS:R.score + logS:S.score + RatioPoll:C.score + R.score:C.score + C.score:S.score

The results show that the factor treatment (grazing) by itself has no significant effect (p-value = 0.21 > 0.05) on the height of plants and same goes for R.score and S.score. This means that the effect of treatment that we previously saw was mainly due to interactions between treatment and other factors.

PARTIE 3: GRAZING EFFECT ON POLLINISATION STRATEGIES

1. Visualization of data and one-factor analysis of variance

# Boxplot 
par(mfrow=c(1,2))


boxplot(df$RatioPoll~df$treatment, varwidth = TRUE, ylab = "Ratio Pollinisation", xlab = "treatment", col='blue', main = "")

The boxplot @ref(fig:fig6) shows no significant differences between plots grazed and those not grazed.

# Anova
poll1<-lm(RatioPoll~1,data=df)
poll2<-lm(RatioPoll~treatment,data=df)
anova(poll1,poll2)

Analysis of Variance Table

Model 1: RatioPoll ~ 1
Model 2: RatioPoll ~ treatment
  Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
1    403 2956.5                              
2    402 2931.7  1    24.786 3.3987 0.06598 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The P-value is not significant at the 5% threshold, so we reject the hypothesis that grazing has an effect on the RatioPoll. The Anova test confirms our graphical observation: there is no difference between the grazed and natural plots.

2.Model test

poll3<-LinearModel(RatioPoll~(treatment+season+year+observer+R.score+S.score+C.score)*
                     (treatment+season+year+observer+R.score+S.score+C.score),data=df,selection = "aic")
poll3

Results for the complete model:
==============================
Call:
LinearModel(formula = RatioPoll ~ (treatment + season + year + 
    observer + R.score + S.score + C.score) * (treatment + season + 
    year + observer + R.score + S.score + C.score), data = df, 
    selection = "aic")

Residual standard error: 2.562 on 370 degrees of freedom
Multiple R-squared:  0.1787 
F-statistic: 2.439 on 33 and 370 DF,  p-value: 3.229e-05 
AIC = 792.6    BIC = 928.6

Results for the model selected by AIC criterion:
===============================================
Call:
LinearModel(formula = RatioPoll ~ treatment + season + year + 
    R.score + S.score + C.score + treatment:R.score + treatment:C.score + 
    season:R.score + season:S.score + year:R.score + year:C.score + 
    R.score:C.score + S.score:C.score, data = df, selection = "aic")

Residual standard error: 2.537 on 383 degrees of freedom
Multiple R-squared:  0.1662 
F-statistic: 3.816 on 20 and 383 DF,  p-value: 1.16e-07 
AIC = 772.7    BIC = 856.7

Ftest
                       SS  df     MS F value   Pr(>F)
treatment            0.46   1  0.463  0.0719 0.788707
season              37.94   2 18.970  2.9472 0.053681
year                11.28   2  5.640  0.8762 0.417215
R.score             32.37   1 32.371  5.0291 0.025497
S.score              2.80   1  2.803  0.4354 0.509748
C.score              0.00   1  0.001  0.0002 0.990072
treatment:R.score   52.95   1 52.954  8.2269 0.004356
treatment:C.score   65.29   1 65.287 10.1429 0.001567
season:R.score      31.54   2 15.770  2.4500 0.087647
season:S.score      24.73   2 12.366  1.9211 0.147854
year:R.score        43.06   2 21.532  3.3451 0.036290
year:C.score        26.21   2 13.107  2.0362 0.131930
R.score:C.score     28.24   1 28.243  4.3878 0.036853
S.score:C.score     19.80   1 19.804  3.0767 0.080223
Residuals         2465.27 383  6.437                 

Ttest
                                 Estimate Std. Error t value Pr(>|t|)
(Intercept)                      0.216628   0.554131  0.3909 0.696064
treatment - grazed               0.100399   0.374374  0.2682 0.788707
treatment - ungrazed            -0.100399   0.374374 -0.2682 0.788707
season - autumn                 -0.982083   0.406102 -2.4183 0.016058
season - spring                  0.456660   0.475236  0.9609 0.337202
season - summer                  0.525422   0.455257  1.1541 0.249170
year - 2014                      0.748663   0.571102  1.3109 0.190674
year - 2015                     -0.303283   0.479913 -0.6320 0.527794
year - 2016                     -0.445380   0.480008 -0.9279 0.354064
R.score                         12.133698   5.410655  2.2426 0.025497
S.score                         -3.199071   4.848189 -0.6598 0.509748
C.score                          0.025179   2.022135  0.0125 0.990072
treatment - grazed : R.score     6.020273   2.098934  2.8683 0.004356
treatment - ungrazed : R.score  -6.020273   2.098934 -2.8683 0.004356
treatment - grazed : C.score    -3.909367   1.227511 -3.1848 0.001567
treatment - ungrazed : C.score   3.909367   1.227511  3.1848 0.001567
season - autumn : R.score        5.258101   2.401900  2.1891 0.029190
season - spring : R.score       -2.739087   3.015881 -0.9082 0.364332
season - summer : R.score       -2.519015   2.651050 -0.9502 0.342612
season - autumn : S.score        4.873384   3.018905  1.6143 0.107288
season - spring : S.score       -6.191616   3.371962 -1.8362 0.067102
season - summer : S.score        1.318232   2.969647  0.4439 0.657364
year - 2014 : R.score           -7.443157   3.103236 -2.3985 0.016939
year - 2015 : R.score            2.072046   2.435265  0.8509 0.395384
year - 2016 : R.score            5.371111   2.368027  2.2682 0.023874
year - 2014 : C.score            2.467530   1.456367  1.6943 0.091020
year - 2015 : C.score           -0.196500   1.156556 -0.1699 0.865178
year - 2016 : C.score           -2.271030   1.181944 -1.9214 0.055419
R.score : C.score              -35.308042  16.855766 -2.0947 0.036853
S.score : C.score               26.694777  15.219008  1.7540 0.080223

The model proposed by R does have a lower BIC than our initial model. However, some of the interactions retained by R have no significant effect. We decided to test a model without these interactions.

3.Model adjustment

poll4<-lm(RatioPoll~( treatment + season + year + 
    observer + R.score + S.score + C.score + treatment:R.score + 
    treatment:C.score + year:C.score + 
    observer:R.score + R.score:C.score),data=df)
summary(poll4)


Call:
lm(formula = RatioPoll ~ (treatment + season + year + observer + 
    R.score + S.score + C.score + treatment:R.score + treatment:C.score + 
    year:C.score + observer:R.score + R.score:C.score), data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.5392 -1.1080 -0.4220  0.2947 29.6572 

Coefficients: (1 not defined because of singularities)
                           Estimate Std. Error t value Pr(>|t|)   
(Intercept)                 1.09759    1.02902   1.067  0.28680   
treatmentungrazed          -0.44174    0.74260  -0.595  0.55229   
seasonspring               -0.87180    0.35892  -2.429  0.01559 * 
seasonsummer                0.07737    0.30486   0.254  0.79979   
year2015                   -1.02099    0.89211  -1.144  0.25313   
year2016                   -0.67267    0.91384  -0.736  0.46212   
observerB                        NA         NA      NA       NA   
R.score                     4.06738    6.16403   0.660  0.50974   
S.score                     4.06607    2.32128   1.752  0.08062 . 
C.score                     0.06155    3.33199   0.018  0.98527   
treatmentungrazed:R.score -11.89261    4.19861  -2.833  0.00486 **
treatmentungrazed:C.score   7.72917    2.47671   3.121  0.00194 **
year2015:C.score           -3.34651    2.37134  -1.411  0.15898   
year2016:C.score           -5.55088    2.36817  -2.344  0.01958 * 
observerB:R.score          10.16478    4.53211   2.243  0.02547 * 
R.score:C.score           -11.47597   14.17758  -0.809  0.41875   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.582 on 389 degrees of freedom
Multiple R-squared:  0.123, Adjusted R-squared:  0.09146 
F-statistic: 3.898 on 14 and 389 DF,  p-value: 2.989e-06

We have observed that the R.score:C.score interaction is not significant, we test a new model without it.

poll5<-lm(RatioPoll~( treatment + season + year + 
    observer + R.score + S.score + C.score + treatment:R.score + 
    treatment:C.score + year:C.score + 
    observer:R.score ),data=df)
summary(poll5)


Call:
lm(formula = RatioPoll ~ (treatment + season + year + observer + 
    R.score + S.score + C.score + treatment:R.score + treatment:C.score + 
    year:C.score + observer:R.score), data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.1056 -1.0829 -0.3976  0.2776 29.5917 

Coefficients: (1 not defined because of singularities)
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)                 1.5512     0.8627   1.798  0.07294 .  
treatmentungrazed          -0.6179     0.7097  -0.871  0.38443    
seasonspring               -0.8549     0.3582  -2.387  0.01746 *  
seasonsummer                0.0786     0.3047   0.258  0.79658    
year2015                   -1.1152     0.8841  -1.261  0.20794    
year2016                   -0.7649     0.9063  -0.844  0.39919    
observerB                       NA         NA      NA       NA    
R.score                     0.9309     4.7917   0.194  0.84607    
S.score                     4.3275     2.2977   1.883  0.06039 .  
C.score                    -1.8351     2.3679  -0.775  0.43881    
treatmentungrazed:R.score -12.5848     4.1088  -3.063  0.00234 ** 
treatmentungrazed:C.score   8.7848     2.1046   4.174 3.69e-05 ***
year2015:C.score           -3.1669     2.3599  -1.342  0.18039    
year2016:C.score           -5.3600     2.3554  -2.276  0.02341 *  
observerB:R.score          10.4201     4.5191   2.306  0.02165 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.581 on 390 degrees of freedom
Multiple R-squared:  0.1215,    Adjusted R-squared:  0.09227 
F-statistic: 4.151 on 13 and 390 DF,  p-value: 1.82e-06

The p-value of S.score is not significant and does not appear in any significant interactions. We therefore decide to remove it

poll6<-lm(RatioPoll~( treatment + season + year + 
    observer + R.score + C.score + treatment:R.score + 
    treatment:C.score + year:C.score + 
    observer:R.score ),data=df)
summary(poll6)


Call:
lm(formula = RatioPoll ~ (treatment + season + year + observer + 
    R.score + C.score + treatment:R.score + treatment:C.score + 
    year:C.score + observer:R.score), data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.3524 -1.1314 -0.4478  0.2407 29.8179 

Coefficients: (1 not defined because of singularities)
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)                 1.7220     0.8607   2.001  0.04612 *  
treatmentungrazed          -0.5616     0.7113  -0.789  0.43034    
seasonspring               -0.7979     0.3580  -2.229  0.02641 *  
seasonsummer                0.1205     0.3049   0.395  0.69286    
year2015                   -1.1891     0.8861  -1.342  0.18038    
year2016                   -0.8061     0.9090  -0.887  0.37575    
observerB                       NA         NA      NA       NA    
R.score                     1.2011     4.8051   0.250  0.80275    
C.score                    -0.8569     2.3177  -0.370  0.71179    
treatmentungrazed:R.score -11.2879     4.0639  -2.778  0.00574 ** 
treatmentungrazed:C.score   8.3717     2.1000   3.987    8e-05 ***
year2015:C.score           -3.6014     2.3562  -1.528  0.12721    
year2016:C.score           -5.9907     2.3390  -2.561  0.01081 *  
observerB:R.score          11.4602     4.4998   2.547  0.01125 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.589 on 391 degrees of freedom
Multiple R-squared:  0.1136,    Adjusted R-squared:  0.08635 
F-statistic: 4.174 on 12 and 391 DF,  p-value: 3.488e-06

4. Validity of the chosen model

Now,we want to check the normality of residuals

pollfinal<-lm(RatioPoll~(treatment + season + year + 
    observer + R.score + C.score + treatment:R.score + 
    treatment:C.score + year:C.score + 
    observer:R.score ),data=df)
res<-residuals(pollfinal)
hist(res,main="Histogram residuals",xlab="Residus")

shapiro.test(res)


    Shapiro-Wilk normality test

data:  res
W = 0.58302, p-value < 2.2e-16

The residuals are not normally distributed @ref(fig:fig7). We’re going to try and do some transformations using log and sqrt.

poll7<-lm(log(RatioPoll+1)~(treatment + season + year + 
    observer + R.score + C.score + treatment:R.score + 
    treatment:C.score + year:C.score + 
    observer:R.score ),data=df )
summary(poll7)


Call:
lm(formula = log(RatioPoll + 1) ~ (treatment + season + year + 
    observer + R.score + C.score + treatment:R.score + treatment:C.score + 
    year:C.score + observer:R.score), data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.8616 -0.3817 -0.1199  0.2218  2.8725 

Coefficients: (1 not defined because of singularities)
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)                0.54228    0.18640   2.909 0.003830 ** 
treatmentungrazed         -0.07940    0.15406  -0.515 0.606579    
seasonspring              -0.24842    0.07754  -3.204 0.001467 ** 
seasonsummer               0.05885    0.06603   0.891 0.373370    
year2015                  -0.30506    0.19190  -1.590 0.112717    
year2016                  -0.19844    0.19686  -1.008 0.314073    
observerB                       NA         NA      NA       NA    
R.score                    2.37006    1.04064   2.277 0.023296 *  
C.score                   -0.80696    0.50195  -1.608 0.108721    
treatmentungrazed:R.score -3.21826    0.88011  -3.657 0.000290 ***
treatmentungrazed:C.score  1.65481    0.45479   3.639 0.000311 ***
year2015:C.score           0.16798    0.51029   0.329 0.742186    
year2016:C.score          -0.42087    0.50656  -0.831 0.406573    
observerB:R.score          1.51327    0.97453   1.553 0.121276    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5607 on 391 degrees of freedom
Multiple R-squared:  0.1889,    Adjusted R-squared:  0.164 
F-statistic: 7.588 on 12 and 391 DF,  p-value: 1.16e-12

The p-value of the shapiro test is still less than 0.05, so the residuals are still not normally distributed @ref(fig:fig8).

poll7<-LinearModel(sqrt(RatioPoll)~( season + C.score + treatment:R.score + 
    treatment:C.score + year:C.score ),data=df)
poll7


Call:
LinearModel(formula = sqrt(RatioPoll) ~ (season + C.score + treatment:R.score + 
    treatment:C.score + year:C.score), data = df)

Residual standard error: 0.7061 on 395 degrees of freedom
Multiple R-squared:  0.1748 
F-statistic: 10.46 on 8 and 395 DF,  p-value: 2.525e-13 
AIC = -272.2    BIC = -236.2

Ftest
                       SS  df      MS F value    Pr(>F)
season              7.389   2  3.6947  7.4097 0.0006932
C.score             0.001   1  0.0009  0.0019 0.9655856
treatment:R.score  23.467   2 11.7333 23.5310 2.213e-10
C.score:treatment   6.411   1  6.4105 12.8562 0.0003785
C.score:year        1.769   2  0.8846  1.7741 0.1709912
Residuals         196.959 395  0.4986                  

Ttest
                      Estimate Std. Error t value  Pr(>|t|)
(Intercept)           0.401349   0.094246  4.2585 2.574e-05
season - autumn       0.069439   0.050345  1.3793 0.1685968
season - spring      -0.213646   0.057100 -3.7416 0.0002099
season - summer       0.144207   0.050304  2.8667 0.0043695
C.score              -0.011798   0.273266 -0.0432 0.9655856
treatment - grazed    4.700511   0.687383  6.8383 3.058e-11
treatment - ungrazed -4.700511   0.687383 -6.8383 3.058e-11
R.score               0.349497   0.643971  0.5427 0.5876277
year - 2014          -0.912854   0.254592 -3.5856 0.0003785
year - 2015           0.246445   0.192734  1.2787 0.2017603
year - 2016           0.666409   0.266422  2.5013 0.0127768

pollfinal<-lm(sqrt(RatioPoll)~( season + C.score + treatment:R.score + 
    treatment:C.score + year:C.score ),data=df)
res<-residuals(pollfinal)
hist(res,main="Histogram residuals",xlab="Residus")

shapiro.test(res)


    Shapiro-Wilk normality test

data:  res
W = 0.87207, p-value < 2.2e-16

Conclusion: We cannot evaluate the impact of grazing on the pollination strategy of plants because the constraints of the general linear model are not respected. In addition, we cannot apply a generalized linear model given the type and values of our data.

GLOBAL CONCLUSION

Grazing does have a positive effect on species richness, and this effect is all the more beneficial in spring.

Grazing has no effect on the seed weight of plants, but does tend to reduce plant height (plant avoidance strategy). However, plant height is not solely a consequence of grazing, it also depends on other physiological parameters (pollination mode, RCS profile, seed weight…) that can be integrated into a model close to a linear model.

Grazing has no effect on the plants’ sexual reproduction strategy, and does not favor the survival of entomogamous plants over others.