Keyphrases
Research Integrity
100%
Plagiarism
70%
MapReduce
70%
Text Big Data
50%
Cost-based
50%
Reducer
50%
Robust Cost
50%
Similarity Join
50%
Set Similarity Join
50%
Experimental Survey
50%
Information Services
50%
Information Concept
50%
Concept Retrieval
50%
Education Program
50%
Similarity Search
50%
Cross-modal Similarity
50%
MapReduce Framework
50%
Web-scale
50%
Humboldt
50%
Reproducible Research
50%
Information Retrieval
50%
Hadoop
50%
Join Algorithm
41%
Data Characteristics
33%
Grayscale
30%
Integrity Violation
20%
Image Manipulation
20%
Data Falsification
20%
Default Configurations
20%
Data Replication
16%
Programming Paradigms
16%
Grouped Data
16%
Word Distribution
16%
Computation Cost
16%
Grouping Strategy
16%
Parallelization
16%
Replication Strategy
16%
Intermediate Data
16%
Skewed Distribution
16%
Spill
16%
MapReduce Programming
16%
Main Memory
16%
Text Data
16%
Dataset Size
16%
Low-similarity
12%
Distributed Settings
12%
Large Amount of Data
12%
Uniform Test
12%
Poor Performance
12%
Small Dataset
12%
Frequent Sets
12%
Similarity Threshold
12%
Different Characteristics
12%
Relative Performance
12%
MapReduce Paradigm
12%
Analytic Investigation
12%
Scholarly Community
10%
Reasonable Suspicion
10%
Admonition
10%
Bad Practices
10%
One-to-one Matching
10%
Community Needs
10%
Matching Algorithm
10%
Experimental Replication
10%
Malpractice
10%
Systematic Use
10%
Detection Device
10%
Image Data
10%
Fraud
10%
Open-source Implementation
10%
Automatic Optimization
10%
Performance Experiment
10%
Implementation Details
10%
Optimization Options
10%
Manual Intervention
10%
Configuration Options
10%
Configuration Parameters
10%
Execution System
10%
Runtime Performance
10%
Configuration Optimization
10%
Computer Science
MapReduce
95%
Join Algorithm
75%
Data Characteristic
50%
Image Manipulation
50%
Integrity Violation
50%
MapReduce Framework
50%
Large Data Set
50%
Hadoop
50%
Main Memory
25%
Parallelization
25%
Relative Performance
25%
Test Environment
25%
Computation Cost
25%
Programming Paradigm
25%
Intermediate Datasets
25%
Grayscale Range
25%
Matching Algorithm
25%
Reasonable Suspicion
25%
MapReduce Paradigm
25%
Default Configuration
20%
Configuration Parameter
10%
Configuration Option
10%
Implementation Detail
10%
Open Source
10%