Merge pull request #13 from PharmCat/dev
Dev
PharmCat authored Jul 17, 2024
2 parents cd2d8fc + d26332c commit e8e672d
Showing 5 changed files with 113 additions and 43 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/Documenter.yml
@@ -24,7 +24,7 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- uses: julia-actions/julia-buildpkg@latest
- uses: julia-actions/julia-docdeploy@latest
env:
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "MetidaStats"
uuid = "75cdad26-409a-4e43-8ad7-d54b4fa665a0"
authors = ["PharmCat <v.s.arnautov@yandex.ru>"]
version = "0.2.1"
version = "0.2.2"

[deps]

66 changes: 66 additions & 0 deletions README.md
@@ -11,3 +11,69 @@ Metida descriptive statistics.
```
import Pkg; Pkg.add(url = "https://github.com/PharmCat/MetidaStats.jl.git")
```

## Import DataFrame

```
using CSV, DataFrames, MetidaStats
data = CSV.File("somedata.csv") |> DataFrame
# variables to analyze
vars = [:Cmax, :AUClast]
# sorting variables
sort = [:form, :period]
ds = dataimport(data; vars = vars, sort = sort)
```

## Get descriptive statistics

```
descriptives(ds, stats = [:n, :mean, :var])
```

## Or without dataimport step

```
descriptives(data; vars = vars, sort = sort, stats = [:n, :mean, :var])
```

Keywords:

- `skipmissing` - drop NaN and Missing values, default = true;
- `skipnonpositive` - drop non-positive values (and NaN, Missing) for "log-statistics" - :geom, :geomean, :logmean, :logvar, :geocv;
- `stats` - default set `stats = [:n, :mean, :sd, :se, :median, :min, :max]`;
- `corrected` - use corrected (unbiased) variance, default = true;
- `level` - level for confidence intervals, default = 0.95.

Possible values for `stats` are:

* :n - number of observations;
* :posn - number of positive observations;
* :mean - arithmetic mean;
* :var - variance;
* :bvar - variance with no correction;
* :geom - geometric mean;
* :logmean - arithmetic mean for log-transformed data;
* :logvar - variance for log-transformed data;
* :sd - standard deviation (or σ);
* :se - standard error;
* :cv - coefficient of variation;
* :geocv - coefficient of variation for log-transformed data;
* :lci - lower confidence interval;
* :uci - upper confidence interval;
* :lmeanci - lower confidence interval for mean;
* :umeanci - upper confidence interval for mean;
* :median - median;
* :min - minimum;
* :max - maximum;
* :range - range;
* :q1 - lower quartile;
* :q3 - upper quartile;
* :iqr - interquartile range;
* :kurt - kurtosis;
* :skew - skewness;
* :harmmean - harmonic mean;
* :ses - standard error of skewness;
* :sek - standard error of kurtosis;
* :sum - sum.
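
The log-statistics in the list above (`:posn`, `:logmean`, `:logvar`, `:geom`, `:geocv`) can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the sample vector `x` is made up, the positive-value filter only stands in for what `skipnonpositive = true` is described to do, and the `geocv` line follows the formula quoted in the `descriptives` docstring.

```julia
using Statistics

# Hypothetical sample; the values are illustrative only.
x = [1.2, 0.0, 3.4, -1.0, 2.5, 4.1]

# Keep strictly positive, non-NaN values (assumed behaviour of `skipnonpositive = true`).
pos = filter(v -> !isnan(v) && v > 0, x)

logx    = log.(pos)
posn    = length(pos)                    # :posn, number of positive observations
logmean = mean(logx)                     # :logmean
logvar  = var(logx; corrected = true)    # :logvar
geom    = exp(logmean)                   # :geom, geometric mean
geocv   = sqrt(exp(logvar) - 1)          # :geocv, per the docstring formula

println((posn = posn, geom = geom, geocv = geocv))
```

Computing `geom` as `exp(logmean)` is equivalent to taking the `posn`-th root of the product of the positive values, which is why non-positive observations have to be dropped first.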
8 changes: 4 additions & 4 deletions docs/src/index.md
@@ -4,7 +4,7 @@
CurrentModule = MetidaStats
```

Metida descriptive statistics.
Metida descriptive statistics: provides tables of categorized descriptive statistics from tabular data.

*This program comes with absolutely no warranty. No liability is accepted for any loss and risk to public health resulting from use of this software.

@@ -37,19 +37,19 @@ ds[1:5, :]

### Import:

```
```@example dsexample
di = MetidaStats.dataimport(ds, vars = [:var1, :var2], sort = [:col, :row])
```

### Statistics:

```
```@example dsexample
des = MetidaStats.descriptives(di; skipmissing = true, skipnonpositive = true, stats = MetidaStats.STATLIST)
```

### Make DataFrame

```
```@example dsexample
df = DataFrame(des)
```

78 changes: 41 additions & 37 deletions src/descriptive.jl
@@ -82,38 +82,41 @@ end
* kwargs:
- `skipmissing` - drop NaN and Missing values, default = true;
- `skipnonpositive` - drop non-positive values (and NaN, Missing) for "log-statistics" - :geom, :geomean, :logmean, :logvar, :geocv;
- `stats` - default set `stats = [:n, :mean, :sd, :se, :median, :min, :max]`
- `stats` - default set `stats = [:n, :mean, :sd, :se, :median, :min, :max]`;
- `corrected` - use corrected (unbiased) variance, default = true;
- `level` - level for confidence intervals, default = 0.95.
Possible values for `stats` are:
* :n - number of observations;
:posn - positive (non-negative) number of observations;
:mean - arithmetic mean;
:var - variance;
:bvar - variance with no correction;
:geom - geometric mean;
:logmean - arithmetic mean for log-transformed data;
:logvar - variance for log-transformed data ``σ^2_{log}``;
:sd - standard deviation (or σ);
:se - standard error;
:cv - coefficient of variation;
:geocv - coefficient of variation for log-transformed data (``CV = sqrt{exp(σ^2_{log})-1}``);
:lci - lower confidence interval;
:uci - upper confidence interval;
:lmeanci - lower confidence interval for mean;
:umeanci - upper confidence interval for mean;
:median - median;
:min - minimum;
:max - maximum;
:range - range;
:q1 - lower quartile;
:q3,
:iqr,
:kurt,
:skew,
:harmmean,
:ses,
:sek,
:sum
* :posn - number of positive observations;
* :mean - arithmetic mean;
* :var - variance;
* :bvar - variance with no correction;
* :geom - geometric mean;
* :logmean - arithmetic mean for log-transformed data;
* :logvar - variance for log-transformed data ``σ^2_{log}``;
* :sd - standard deviation (or σ);
* :se - standard error;
* :cv - coefficient of variation;
* :geocv - coefficient of variation for log-transformed data (``CV = sqrt{exp(σ^2_{log})-1}``);
* :lci - lower confidence interval;
* :uci - upper confidence interval;
* :lmeanci - lower confidence interval for mean;
* :umeanci - upper confidence interval for mean;
* :median - median;
* :min - minimum;
* :max - maximum;
* :range - range;
* :q1 - lower quartile;
* :q3 - upper quartile;
* :iqr - interquartile range;
* :kurt - kurtosis;
* :skew - skewness;
* :harmmean - harmonic mean;
* :ses - standard error of skewness;
* :sek - standard error of kurtosis;
* :sum - sum.
"""
function descriptives(data, vars, sort = nothing; kwargs...)
@@ -124,6 +127,7 @@ function descriptives(data, vars, sort = nothing; kwargs...)
if eltype(vars) <: Integer vars = Tables.columnnames(data)[vars] end
if !isnothing(sort)
vars = setdiff(vars, sort)
if length(sort) == 0 sort = nothing end
end
descriptives(dataimport_(data, vars, sort); kwargs...)
end
@@ -211,10 +215,10 @@ function descriptives_(obsvec, kwargs, logstats, cicalk)
end
n_ = length(vec)
if cicalk
if n_ > 1 q = quantile(TDist(n_ - 1), 1 - (1-kwargs[:level])/2) end
if n_ > 1 q = quantile(TDist(n_ - 1), 1 - (1 - kwargs[:level]) / 2) end # add tdist / normal option # add multiple CI ?
end
# skipnonpositive
#logstats = makelogvec #calk logstats
# logstats = makelogvec #calk logstats
if logstats
if kwargs[:skipnonpositive]
logvec = log.(skipnonpositive(obsvec))
@@ -272,21 +276,21 @@ function descriptives_(obsvec, kwargs, logstats, cicalk)
elseif s == :uci
haskey(result, :mean) || begin result[:mean] = sum(vec) / n_ end
haskey(result, :sd) || begin result[:sd] = std(vec; corrected = kwargs[:corrected], mean = result[:mean]) end
result[s] = result[:mean] + q*result[:sd]
result[s] = result[:mean] + q * result[:sd]
elseif s == :lci
haskey(result, :mean) || begin result[:mean] = sum(vec) / n_ end
haskey(result, :sd) || begin result[:sd] = std(vec; corrected = kwargs[:corrected], mean = result[:mean]) end
result[s] = result[:mean] - q*result[:sd]
result[s] = result[:mean] - q * result[:sd]
elseif s == :umeanci
haskey(result, :mean) || begin result[:mean] = sum(vec) / n_ end
haskey(result, :sd) || begin result[:sd] = std(vec; corrected = kwargs[:corrected], mean = result[:mean]) end
haskey(result, :se) || begin result[:se] = result[:sd] / sqrt(n_) end
result[s] = result[:mean] + q*result[:se]
result[s] = result[:mean] + q * result[:se]
elseif s == :lmeanci
haskey(result, :mean) || begin result[:mean] = sum(vec) / n_ end
haskey(result, :sd) || begin result[:sd] = std(vec; corrected = kwargs[:corrected], mean = result[:mean]) end
haskey(result, :se) || begin result[:se] = result[:sd] / sqrt(n_) end
result[s] = result[:mean] - q*result[:se]
result[s] = result[:mean] - q * result[:se]
elseif s == :median
result[s] = median(vec)
elseif s == :min
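
Read together with the quantile line in the earlier hunk, the four interval branches above reduce to the formulas below. The notation is introduced here for illustration and does not appear in the source: $\bar{x}$ is the mean, $s$ the standard deviation (computed with the `corrected` setting), $n$ the number of observations, and $q$ the Student's t quantile, which the code only computes when $n > 1$.

```math
\begin{aligned}
q &= t_{\,n-1,\; 1 - (1 - \mathrm{level})/2} \\
\mathrm{se} &= \frac{s}{\sqrt{n}} \\
\mathrm{lci} &= \bar{x} - q\,s, &\quad \mathrm{uci} &= \bar{x} + q\,s \\
\mathrm{lmeanci} &= \bar{x} - q\,\mathrm{se}, &\quad \mathrm{umeanci} &= \bar{x} + q\,\mathrm{se}
\end{aligned}
```

Note that `:lci` and `:uci` scale the quantile by the standard deviation, while `:lmeanci` and `:umeanci` scale it by the standard error, so only the latter pair is an interval for the mean.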
@@ -403,13 +407,13 @@ function MetidaBase.metida_table_(obj::DataSet{DS}; sort = nothing, stats = noth
stats ⊆ STATLIST || error("Some statistics not known!")
if isa(stats, Symbol) stats = [stats] end
if isnothing(sort)
ressetl = collect(intersect(resset, stats))
ressetl = sortbyvec!(collect(intersect(resset, stats)), collect(keys(first(obj).result)))
else
ressetl = sortbyvec!(collect(intersect(resset, stats)), sort)
end
else
if isnothing(sort)
ressetl = collect(resset)
ressetl = sortbyvec!(collect(resset), collect(keys(first(obj).result)))
else
ressetl = sortbyvec!(collect(resset), sort)
end
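
The last hunk replaces a bare `collect` of the result set with a call to `sortbyvec!`, using the keys of the first result as the reference order. Presumably this makes the column order deterministic, since iterating a `Set` has no guaranteed order (assuming `resset` is a set, which this diff does not show). A rough stand-in for that ordering step, using only Base, might look like the sketch below; `order_by_reference!` is a hypothetical helper, not the MetidaBase function, and its handling of unknown items is an assumption.

```julia
# Hypothetical stand-in for the ordering step; NOT the MetidaBase `sortbyvec!`.
function order_by_reference!(items::Vector{Symbol}, reference::Vector{Symbol})
    # Rank each item by its position in `reference`; items not found go last.
    rank(s) = something(findfirst(==(s), reference), length(reference) + 1)
    return sort!(items; by = rank)
end

resset    = Set([:max, :n, :mean])      # unordered set of computed statistics
reference = [:n, :mean, :sd, :max]      # e.g. key order of the first result

cols = order_by_reference!(collect(resset), reference)
println(cols)
```

Whatever order `collect(resset)` happens to produce, the sorted vector follows the reference order, so the resulting table columns stay stable between runs.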

2 comments on commit e8e672d

@PharmCat

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/111257

Tip: Release Notes

Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.

@JuliaRegistrator register

Release notes:

## Breaking changes

- blah

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.2.2 -m "<description of version>" e8e672d1e200a67e9a148319824468beb644eddb
git push origin v0.2.2
