Title: | R Actuarial Workshops |
---|---|
Description: | In order to facilitate R instruction for actuaries, we have organized several sets of publicly available data of interest to non-life actuaries. In addition, we suggest a set of packages, which most practicing actuaries will use routinely. Finally, there is an R markdown skeleton for basic reserve analysis. |
Authors: | Brian A. Fannin [aut, cre] |
Maintainer: | Brian A. Fannin <[email protected]> |
License: | MPL-2.0 | file LICENSE |
Version: | 0.1.10 |
Built: | 2024-11-21 06:23:10 UTC |
Source: | https://github.com/casact/raw_package |
This will change the column names of a data frame obtained from the CAS database of reserve triangles. This data was procured and modified by Glenn G. Meyers and Peng Shi and is hosted by the CAS. More information on the data is available here: http://www.casact.org/research/index.cfm?fa=loss_reserves_data
CasColNames(df, restore = TRUE)
df |
A data frame of loss reserving data |
restore |
Boolean indicating whether to restore the names defined by Meyers and Shi. The default is true. |
A data frame with different column names.
## Not run:
data(NJM_WC)
NJM_WC <- CasColNames(NJM_WC, restore = TRUE)
## End(Not run)
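The renaming idea can be sketched with a lookup vector in base R. This is only an illustration, not the package's implementation: `rename_cols` and `name_map` are hypothetical, and the two "original" names in the map are assumed stand-ins rather than a confirmed list of the Meyers and Shi column names.

```r
# Hypothetical lookup: names are assumed originals, values are friendly names
name_map <- c(GRCODE = "GroupCode", GRNAME = "Company")

# Sketch of a two-way rename; restore = TRUE maps friendly names back
rename_cols <- function(df, map, restore = FALSE) {
  if (restore) {
    # Reverse the lookup so that friendly names point to original names
    map <- setNames(names(map), map)
  }
  hit <- names(df) %in% names(map)
  names(df)[hit] <- unname(map[names(df)[hit]])
  df
}

# Invented one-row data frame using the assumed original names
raw_df <- data.frame(GRCODE = 1, GRNAME = "a", Lag = 1)
friendly_df <- rename_cols(raw_df, name_map)
restored_df <- rename_cols(friendly_df, name_map, restore = TRUE)
```

Columns that are not in the lookup (here `Lag`) are left untouched in both directions.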
The Casualty Actuarial Society's Committee on Theory of Risk (COTOR) issued several challenges as part of their work. These involved artificially constructed loss data. Actuaries were invited to construct models to estimate appropriate premiums for risk transfer.
Note that there is no COTOR1.
COTOR2
COTOR2 is a vector containing 250 observations of claim values.
COTOR3
COTOR3 is a data frame with 490 observations of 2 variables. The values were generated by the great Stuart Klugman.
Year: an integer indicating the year in which the claim occurred.
Amount: the amount of the claim.
COTOR4
COTOR4 is a data frame with 2500 rows and 2 columns.
Year: an integer indicating the year in which the claim occurred.
Amount: the amount of the claim.
COTOR5
COTOR5 is a data frame with 4849 rows and 5 columns.
Risk: A single letter indicating the individual risk to which the claim belongs.
Year: an integer indicating the year in which the claim occurred.
MSA: simulated metropolitan statistical area.
Amount: the amount of the claim.
data(COTOR2)
Listing of the position and windspeed of hurricanes from 1970 through 2011.
Data is taken from the National Hurricane Center, a division of the National Oceanic and Atmospheric Administration. A description of the data as well as data for storms in other regions may be found here: IBTrACS.
The fields "SerialNumber", "Basin", "SubBasin", "Nature", "Pressure", "Center", "WindPercentile", "PressurePercentile" and "TrackType" have been removed.
Year: Integer from 1971 through 2011
Number: Integer used to designate the hurricanes in a particular year
Name: The name of the hurricane
ISO_Time: The time the observation was made
Latitude: Latitude where the observation was made
Longitude: Longitude where the observation was made
Wind: windspeed measured in knots (kt)
Pressure: atmospheric pressure measured in millibars (mb)
data(Hurricane)
An object of class spec_tbl_df (inherits from tbl_df, tbl, data.frame) with 43122 rows and 8 columns.
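Because each storm appears once per observation time, a common first step is to collapse the observations to one row per storm. The sketch below does this for peak windspeed with base R. The data frame `obs` is an invented stand-in with the same column names as those listed above, not real hurricane data.

```r
# Invented stand-in for a few Hurricane rows (values are not real)
obs <- data.frame(
  Year = c(2005, 2005, 2005, 2011, 2011),
  Name = c("STORM_A", "STORM_A", "STORM_B", "STORM_A", "STORM_A"),
  Wind = c(35, 60, 45, 80, 70)
)

# One row per (Year, Name) holding the maximum observed windspeed in knots
peak <- aggregate(Wind ~ Year + Name, data = obs, FUN = max)
```

Grouping on both `Year` and `Name` matters because storm names can recur in different years.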
This function will install the full set of suggested packages.
InstallSuggestions(pkg = "raw", quiet, ...)
pkg |
Name of the package for which you want to install suggestions |
quiet |
Include additional messages while installing? |
... |
Additional named arguments passed to 'install.packages' |
A character vector indicating which packages were installed.
## Not run:
InstallSuggestions()
## End(Not run)
Complete triangle of losses for the same 4 private passenger auto and workers comp companies. Data is taken from the CAS.
GroupCode: NAIC company code
Company
AccidentYear
DevelopmentYear
Lag: Integer indicating the development lag in years.
CumulativeIncurred
CumulativePaid
IBNR: Bulk and IBNR reserves
DirectEP: Direct and assumed earned premium, gross of reinsurance.
CededEP: Earned premium ceded to reinsurers.
NetEP: Premium earned net of reinsurance.
Single: Either 1 or 0. 1 indicates a single entity, 0 indicates a group entity.
Reserve1997: Posted reserves as of year-end 1997.
https://www.casact.org/research/reserve_data/ppauto_pos.csv
https://www.casact.org/research/reserve_data/wkcomp_pos.csv
https://www.casact.org/research/reserve_data/othliab_pos.csv
This data represents ten complete years of Schedule P development for many NAIC reporting companies. The data was prepared by Glenn Meyers and Peng Shi and is available from the CAS https://www.casact.org/research/index.cfm?fa=loss_reserves_data.
Note that the columns have been given more user-friendly names. The original column names may be restored by using the CasColNames function.
GroupCode: NAIC company code
Company
AccidentYear
DevelopmentYear
Lag: Integer indicating the development lag in years.
CumulativeIncurred
CumulativePaid
IBNR: Bulk and IBNR reserves
DirectEP: Direct and assumed earned premium, gross of reinsurance.
CededEP: Earned premium ceded to reinsurers.
NetEP: Premium earned net of reinsurance.
Single: Either 1 or 0. 1 indicates a single entity, 0 indicates a group entity.
Reserve1997: Posted reserves as of year-end 1997.
https://www.casact.org/research/reserve_data/ppauto_pos.csv
https://www.casact.org/research/reserve_data/wkcomp_pos.csv
https://www.casact.org/research/reserve_data/comauto_pos.csv
https://www.casact.org/research/reserve_data/medmal_pos.csv
https://www.casact.org/research/reserve_data/prodliab_pos.csv
https://www.casact.org/research/reserve_data/othliab_pos.csv
This data represents ten complete years of Schedule P workers comp development for the New Jersey Manufacturing Company. The data was prepared by Glenn Meyers and Peng Shi and is available from the CAS https://www.casact.org/research/index.cfm?fa=loss_reserves_data. The columns have been given more user friendly names. The original column names may be restored by using the CasColNames function.
This data has been included because it is the triangle used by Greg Taylor and Grainne McGuire in their monograph "Stochastic Loss Reserving Using Generalized Linear Models". https://www.casact.org/pubs/monographs/index.cfm?fa=taylor-monograph03
This is a data frame with 100 observations of 13 variables.
GroupCode: NAIC company code
Company
AccidentYear
DevelopmentYear
Lag: Integer indicating the development lag in years.
CumulativeIncurred
CumulativePaid
IBNR: Bulk and IBNR reserves
DirectEP: Direct and assumed earned premium, gross of reinsurance.
CededEP: Earned premium ceded to reinsurers.
NetEP: Premium earned net of reinsurance.
Single: Either 1 or 0. 1 indicates a single entity, 0 indicates a group entity.
Reserve1997: Posted reserves as of year-end 1997.
data(NJM_WC)
An object of class tbl_df (inherits from tbl, data.frame) with 100 rows and 13 columns.
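The triangle is stored with cumulative amounts, so incremental amounts have to be derived by differencing within each accident year. A minimal sketch in base R follows. The data frame `tri` is an invented stand-in that borrows the `AccidentYear`, `Lag` and `CumulativePaid` column names, and `IncrementalPaid` is a hypothetical new column.

```r
# Invented stand-in for a small cumulative paid triangle in long format
tri <- data.frame(
  AccidentYear = c(1988, 1988, 1988, 1989, 1989, 1990),
  Lag = c(1, 2, 3, 1, 2, 1),
  CumulativePaid = c(100, 150, 170, 110, 160, 120)
)

# Differencing assumes rows are ordered by lag within accident year
tri <- tri[order(tri$AccidentYear, tri$Lag), ]

# First lag keeps its cumulative value; later lags are successive differences
tri$IncrementalPaid <- ave(
  tri$CumulativePaid, tri$AccidentYear,
  FUN = function(x) c(x[1], diff(x))
)
```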
PackageDependencies will examine the DESCRIPTION file of a package and return a vector of dependencies.
PackageDependencies( x, type = c("Depends", "Imports", "Suggests", "Enhances", "LinkingTo") )
x |
The name of the package |
type |
What sort of dependency are you looking for? Permissible values are "Depends", "Imports", "Suggests", "Enhances" and "LinkingTo" |
## Not run:
PackageDependencies("ggplot2", "Depends")
## End(Not run)
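A dependency field in a DESCRIPTION file is a comma-separated string that may carry version constraints and line breaks. The sketch below shows one way such a field could be turned into a vector of package names. `parse_deps` is a hypothetical helper, not the package's own code, and dropping the `R` entry is an assumption about what a caller would want.

```r
# Hypothetical helper: split a DESCRIPTION dependency field into package names
parse_deps <- function(field) {
  if (is.null(field) || is.na(field)) {
    return(character(0))
  }
  parts <- strsplit(field, ",", fixed = TRUE)[[1]]
  parts <- sub("\\(.*\\)", "", parts)  # drop version constraints such as (>= 3.5.0)
  parts <- trimws(parts)               # drop surrounding spaces and line breaks
  parts[nzchar(parts) & parts != "R"]  # assumption: the R version entry is not wanted
}

# Invented field text in the style of a DESCRIPTION file
deps <- parse_deps("R (>= 3.5.0), methods,\n  stats (>= 4.0)")

# The same helper applied to a field read from an installed package
stats_imports <- parse_deps(utils::packageDescription("stats")$Imports)
```

A field that is absent from the DESCRIPTION file comes back as `NULL`, which the helper maps to an empty character vector.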
This function will indicate whether a package has been installed. It uses the installed.packages function for this.
PackageIsInstalled(pkgs)
pkgs |
Character vector of package names. |
A logical vector indicating whether the packages have been installed.
## Not run:
PackageIsInstalled("ggplot2")
## End(Not run)
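The check described above can be sketched in one line, since `installed.packages` returns a matrix whose row names are package names. `is_installed` is a hypothetical name for illustration, and the second package name in the call is deliberately made up.

```r
# Hypothetical equivalent: look each name up among the installed packages
is_installed <- function(pkgs) {
  pkgs %in% rownames(utils::installed.packages())
}

# "stats" ships with R; the second name is assumed not to exist
found <- is_installed(c("stats", "thisPackageShouldNotExist12345"))
```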
This function will indicate whether a package needs to be updated. It uses the old.packages function for this.
PackageNeedsUpdating(pkgs)
pkgs |
Character vector of package names. |
A logical vector indicating whether the packages need to be updated.
## Not run:
PackageNeedsUpdating("ggplot2")
## End(Not run)
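`old.packages` returns a matrix with one row per outdated package, or `NULL` when nothing is outdated, so the check reduces to a lookup against its row names. The sketch below is an assumption about how such a check could be written, not the package's code. `needs_update` and its `old` argument are hypothetical, and the matrix passed in the example is invented so that no repository access is needed.

```r
# Hypothetical equivalent; `old` defaults to a live repository query
needs_update <- function(pkgs, old = utils::old.packages()) {
  # rownames(NULL) is NULL, so every package maps to FALSE when nothing is outdated
  pkgs %in% rownames(old)
}

# Invented stand-in for the result of old.packages()
fake_old <- matrix("pkgA", nrow = 1, dimnames = list("pkgA", "Package"))

flags <- needs_update(c("pkgA", "pkgB"), old = fake_old)
none_outdated <- needs_update("pkgA", old = NULL)
```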
This is data taken from Appendix A of the "Basic Ratemaking" study note by Werner and Modlin.
Rate change
This is a data frame with 7 observations of 5 variables.
RateGroup: Letters from A-G. Note that rate group "A" does not have any rate changes.
EffectiveDate: The effective date of the rate change.
RateChange: A number indicating the rate change. Note that this is stored as a decimal, NOT as a percent, i.e. 0.05 = 5%.
RateLevelIndex: This is one plus the rate change.
CumulativeRateLevel: This is the cumulative product of the RateLevelIndex. Note that this varies slightly from the figures shown in the study note. I believe this is due to rounding in the study note.
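The relationship between the three rate columns can be reproduced with `cumprod`. The rate changes below are invented for illustration and are not the figures from the study note.

```r
# Invented rate changes, stored as decimals (0.05 means 5%)
rate_change <- c(0.05, 0.10, -0.02)

# Rate level index is one plus the rate change
rate_level_index <- 1 + rate_change

# Cumulative rate level is the running product of the indices
cumulative_rate_level <- cumprod(rate_level_index)
```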
Accident year
This is a data frame with 5 observations of 4 variables.
AccidentYear: The accident year.
EarnedExposure: The exposure earned for the accident year.
EarnedPremium: The premium earned for the accident year.
ReportedLoss: The reported loss and paid ALAE for the accident year.
Premium trend
Data frame with 25 observations of 3 values. The premium is shown at current level. This may be used to determine premium or exposure trend.
QuarterEnding: The ending quarter as a Date type.
WrittenPremium: Premium written during the quarter at current rate level.
WrittenExposure: Exposure written during the quarter.
Loss development
Data frame with 25 observations of 3 values. Note that this loss development data is stored in "long" rather than "wide" format.
AccidentYear: The accident year as an integer.
Lag: The lag in months as an integer.
ReportedLoss: The reported loss and paid ALAE for that accident year and lag.
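Long data can be pivoted into the familiar wide triangle with `tapply`, after which age-to-age factors are a ratio of adjacent columns. The sketch below uses an invented data frame `long` with the same three column names. The losses are not taken from the study note.

```r
# Invented long-format development data (lag in months)
long <- data.frame(
  AccidentYear = c(2001, 2001, 2001, 2002, 2002, 2003),
  Lag = c(12, 24, 36, 12, 24, 12),
  ReportedLoss = c(1000, 1500, 1650, 1100, 1700, 1200)
)

# Rows are accident years, columns are lags; unobserved cells become NA
wide <- tapply(long$ReportedLoss, list(long$AccidentYear, long$Lag), sum)

# Age-to-age factors: each column divided by the column before it
ata <- wide[, -1] / wide[, -ncol(wide)]
```

The columns of `ata` are labelled with the ending lag, so the column named "24" holds the 12-to-24 month factors.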
Loss trend
Data frame with 20 observations of 7 values. The data may be used to estimate loss trend.
YearEndingQuarter: The end of a one-year period as a Date.
EarnedExposure: The exposure earned during the period.
ClosedClaimCount: The number of claims closed during the period.
PaidLosses: The amount of loss paid during the period.
Frequency: The closed claim count divided by the exposure.
Severity: The paid losses divided by the closed claim count.
PurePremium: The paid losses divided by the earned exposure.
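The three derived columns follow directly from the definitions above, and pure premium equals frequency times severity. A short sketch with invented figures:

```r
# Invented inputs for two one-year periods
earned_exposure <- c(1000, 1100)
closed_claim_count <- c(50, 66)
paid_losses <- c(100000, 145200)

# Definitions as given in the field list
frequency <- closed_claim_count / earned_exposure
severity <- paid_losses / closed_claim_count
pure_premium <- paid_losses / earned_exposure
```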
ULAE
A data frame with 3 observations of 4 values.
CalendarYear: The calendar year as an integer.
PaidLoss_ALAE: The paid loss and ALAE for the calendar year.
PaidULAE: The ULAE paid in the calendar year.
ULAE_Ratio: The ratio of ULAE to the paid loss and ALAE.
data(PPA)
This is a package that stores data used in R workshops sponsored by the Casualty Actuarial Society.
For more information, read the package vignette. To access the package vignette, type in the code shown in the Examples section (the line without the hash marks).
## Not run:
vignette(topic = "raw")
## End(Not run)
Consolidation of state data into various regions. The state data was generated randomly as a Poisson sample.
Region: The name of the region
PolicyYear: An integer between 2001 and 2010
NumPolicies: Integer giving the number of policies in that region for that policy year
NumClaims: Integer giving the number of claims in that region for that policy year
data(RegionExperience)
An object of class data.frame with 40 rows and 4 columns.
Random illustrative data
Region: The name of the region to which the state belongs
State: The name of the state
Postal: Two character postal code for the state
PolicyYear: An integer between 2001 and 2010.
NumPolicies: Integer giving the number of policies in that state for that policy year
NumClaims: Integer giving the number of claims in that state for that policy year
data(StateExperience)
An object of class data.frame with 500 rows and 6 columns.
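Consolidating state rows into regions amounts to summing policies and claims by region and policy year. The sketch below does this with `aggregate` on an invented data frame that borrows the StateExperience column names. The state labels, region labels and counts are made up, and `Frequency` is a hypothetical derived column.

```r
# Invented stand-in for a few StateExperience rows
state <- data.frame(
  Region = c("East", "East", "West", "West"),
  State = c("S1", "S2", "S3", "S4"),
  PolicyYear = 2001L,
  NumPolicies = c(100L, 200L, 150L, 50L),
  NumClaims = c(10L, 30L, 15L, 5L)
)

# Sum policies and claims within each region and policy year
region <- aggregate(
  cbind(NumPolicies, NumClaims) ~ Region + PolicyYear,
  data = state, FUN = sum
)

# Claim frequency per policy at the regional level
region$Frequency <- region$NumClaims / region$NumPolicies
```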