The following post discusses the ‘percentage correct’ method of evaluating predictions and explains why it may not be the most precise way to measure performance. I also examine the topic of...
In this article, I continue exploring logging as a data set. I described this type of dataset earlier in the Log Management and Big Data Analytics post. In this section,...
In the following article, I explore the issue of log collection and analysis, a very specific problem domain for many large organizations. Logging is a suitable example of a...
This article is just me thinking out loud about creating something better than the simple wordcount.java example that is usually bundled with Big Data solutions such as Hadoop - which...
This is a short guide on how to install a single-node Hadoop cluster on a Windows computer without Cygwin. The intention behind this little test is to have a test...
This is just a short look at the popularity of MongoDB, Redis and Apache Cassandra. Recently I was a bit surprised by the Google Trends image posted at KDnuggets in...
The following article provides a high-level overview of NoSQL databases and the various associated data store types related to these kinds of databases. A particular section of the article is...
Recently I came across a statement that said: “MongoDB (btw. that’s MongoDB) uses the BSON format which extends the JSON model to provide additional data types” and I think this...
The following article analyses the applicability of the CAP theorem to Big Data. I will explain the CAP theorem, explore its three characteristics, as well as provide the...
Microtargeting (also micro-targeting or micro-niche targeting) is one of the methods used by the marketing sector to analyze consumer data collected from various sources to detect interests of...