Package: FeatureHashing 0.9.2

Wush Wu

FeatureHashing: Creates a Model Matrix via Feature Hashing with a Formula Interface

Feature hashing, also called the hashing trick, is a method for transforming the features of an instance into a vector, and therefore for transforming a real dataset into a matrix. Instead of looking up indices in an associative array, it applies a hash function to the features and uses their hash values directly as indices. The feature hashing method in this package was proposed in Weinberger et al. (2009) <arxiv:0902.2206>. The hashing algorithm is the murmurhash3 from the 'digest' package. Please see the README in <https://github.com/wush978/FeatureHashing> for more information.
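
To make the idea concrete, here is a minimal base-R sketch of the hashing trick. It is not the package's implementation: the helper names `toy_hash` and `hash_features` are hypothetical, and the toy polynomial hash is a stand-in for murmurhash3, chosen only so that the example runs without extra packages.

```r
# Toy string hash (polynomial rolling hash). This is an assumption for
# illustration only; the package itself uses murmurhash3.
toy_hash <- function(s) {
  h <- 0
  for (code in utf8ToInt(s)) {
    # Keep the running value below 2^31 so double arithmetic stays exact.
    h <- (h * 31 + code) %% 2147483647
  }
  h
}

# Map a character vector of features to a fixed-length numeric vector.
# The hash value, reduced modulo hash_size, is used directly as the index,
# so no dictionary of known features has to be built or stored.
hash_features <- function(features, hash_size) {
  x <- numeric(hash_size)
  for (f in features) {
    idx <- toy_hash(f) %% hash_size + 1  # R vectors are 1-based
    x[idx] <- x[idx] + 1                 # colliding features add up
  }
  x
}

instance <- c("country=TW", "browser=firefox", "os=linux")
x <- hash_features(instance, hash_size = 16)
print(x)
print(which(x != 0))  # buckets hit by this instance
```

Two different features can land in the same bucket. A larger `hash_size` makes such collisions less likely, at the cost of a wider matrix.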

Authors: Wush Wu [aut, cre], Michael Benesty [aut, ctb]

FeatureHashing_0.9.2.tar.gz
FeatureHashing_0.9.2.zip (r-4.5), FeatureHashing_0.9.2.zip (r-4.4), FeatureHashing_0.9.2.zip (r-4.3)
FeatureHashing_0.9.2.tgz (r-4.4-x86_64), FeatureHashing_0.9.2.tgz (r-4.4-arm64), FeatureHashing_0.9.2.tgz (r-4.3-x86_64), FeatureHashing_0.9.2.tgz (r-4.3-arm64)
FeatureHashing_0.9.2.tar.gz (r-4.5-noble), FeatureHashing_0.9.2.tar.gz (r-4.4-noble)
FeatureHashing_0.9.2.tgz (r-4.4-emscripten), FeatureHashing_0.9.2.tgz (r-4.3-emscripten)
FeatureHashing.pdf | FeatureHashing.html
FeatureHashing/json (API)

# Install 'FeatureHashing' in R:
install.packages('FeatureHashing', repos = c('https://cranhaven.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/wush978/featurehashing/issues

Uses libs:
  • c++ - GNU Standard C++ Library v3
Datasets:
  • ipinyou.test - iPinYou Real-Time Bidding Dataset for Computational Advertising Research
  • ipinyou.train - iPinYou Real-Time Bidding Dataset for Computational Advertising Research
  • test.tag - Test.tag

On CRAN:

archived packages r-universe cpp

4.85 score, 5 stars, 141 scripts, 294 downloads, 1 mention, 7 exports, 6 dependencies

Last updated 16 hours ago from: ea9524c273 (on package/FeatureHashing). Checks: 7 OK, 2 NOTE. Indexed: no.

Target | Result | Latest binary
Doc / Vignettes | OK | Jan 19 2025
R-4.5-win-x86_64 | NOTE | Jan 19 2025
R-4.5-linux-x86_64 | NOTE | Jan 19 2025
R-4.4-win-x86_64 | OK | Jan 19 2025
R-4.4-mac-x86_64 | OK | Jan 19 2025
R-4.4-mac-aarch64 | OK | Jan 19 2025
R-4.3-win-x86_64 | OK | Jan 19 2025
R-4.3-mac-x86_64 | OK | Jan 19 2025
R-4.3-mac-aarch64 | OK | Jan 19 2025

Exports: hash.mapping, hash.sign, hash.size, hashed.interaction.value, hashed.model.matrix, hashed.value, intToRaw

Dependencies: BH, digest, lattice, magrittr, Matrix, Rcpp

FeatureHashing

Rendered from FeatureHashing.Rmd using knitr::rmarkdown on Jan 19 2025.

Last update: 2025-01-19
Started: 2025-01-19

Sentiment Analysis via R, FeatureHashing and XGBoost

Rendered from SentimentAnalysis.Rmd using knitr::rmarkdown on Jan 19 2025.

Last update: 2025-01-19
Started: 2025-01-19

Readme and manuals

Help Manual

Help page | Topics
CSCMatrix | %*%,CSCMatrix,numeric-method %*%,numeric,CSCMatrix-method CSCMatrix-class dim,CSCMatrix-method dim<-,CSCMatrix-method [,CSCMatrix,missing,numeric,ANY-method [,CSCMatrix,numeric,missing,ANY-method [,CSCMatrix,numeric,numeric,ANY-method
Extract mapping between hash and original values | hash.mapping
Compute minimum hash size to reduce collision rate | hash.size
Create a model matrix with feature hashing | hash.sign hashed.interaction.value hashed.model.matrix hashed.value
Convert the integer to raw vector with endian correction | intToRaw
iPinYou Real-Time Bidding Dataset for Computational Advertising Research | ipinyou ipinyou.test ipinyou.train
Simulate how 'split' work in 'hashed.model.matrix' to split the string into tokens | simulate.split
test.tag | test.tag