Take, for example, the first step: split half of the data into a training set and the other half into a test set.

    Xtr = convert(Matrix, DF)'   # training set
    Xte = convert(Matrix, DF)'   # test set

I don’t know of such a package. What exactly do you mean by “perform the dimension reduction automatically”? MultivariateStats.jl already allows you to set a maximum number of PCs and/or a proportion of variance to explain when fitting a PCA.

Running a regression on all subsets of variables requires a few lines of code, but is also pretty straightforward (the following is based on this discussion):

    using DataFrames, StatsBase, GLM, Combinatorics
    df = DataFrame(x1 = randn(10), x2 = randn(10), x3 = randn(10), x4 = randn(10))
    df… = df.x1. … + randn.()
    formulas = [… ~ sum(c) for c in term_combis]
    comparison = DataFrame(formula = formulas, nterms = length.(term_combis), aic = aics)

In this case, models 9, 14, and 15 are all pretty close together in terms of AIC, so I wouldn’t feel comfortable declaring one “best” without some more work (cross-validation, model averaging…), depending on what you’re using the models for. “Best” in a situation like this could mean a number of different things, so you will probably have to define what your criteria are.
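Parts of the all-subsets snippet did not survive, so here is a minimal end-to-end sketch of the same idea rather than the original author's exact code. It makes several assumptions: the response column `y` and its coefficients are invented for illustration, `predictors`, `term_combis`, `models`, and `aics` are filled in as one plausible way to produce the names used in the surviving lines, and the final sort by AIC is an addition. It relies on `term` (re-exported by GLM from StatsModels) and on `combinations` from Combinatorics, which iterates over subsets of every size.

```julia
using DataFrames, GLM, StatsBase, Combinatorics, Random

Random.seed!(1)  # fixed seed so the illustration is reproducible

# Simulated predictors: 10 rows, four columns of Gaussian noise.
df = DataFrame(x1 = randn(10), x2 = randn(10), x3 = randn(10), x4 = randn(10))

# ASSUMPTION: the original response definition was lost.
# Here y depends on x1 and x2 plus Gaussian noise.
df.y = df.x1 .+ 2 .* df.x2 .+ randn(nrow(df))

predictors = [:x1, :x2, :x3, :x4]

# Every non-empty subset of the predictors: 2^4 - 1 = 15 combinations.
term_combis = [term.(c) for c in combinations(predictors)]

# One formula per subset, e.g. y ~ x1 + x3.
formulas = [term(:y) ~ sum(c) for c in term_combis]

# Fit an ordinary least-squares model for each formula and record its AIC.
models = [lm(f, df) for f in formulas]
aics = aic.(models)

comparison = DataFrame(formula = formulas, nterms = length.(term_combis), aic = aics)
sort!(comparison, :aic)  # lowest AIC first
```

Because the data are random, the row numbers and AIC values will not match the "models 9, 14, and 15" mentioned above. With only 10 observations, small AIC differences near the top of `comparison` are better treated as ties than as a ranking.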