Have you ever wondered what makes a service a ‘cloud’ application? Or, alternatively, how any app becomes cloud based? The following article dives a little deeper into these often-asked questions and explores some of the characteristics and criteria of the cloud. To illustrate the point, I examine two well-known public services (Airbnb and Spotify) and analyze whether the ‘cloud application’ sticker so often attached to these two online services is merited. [Read more…]
The following post is a short guide on how to expire Amazon S3 objects by means of defining a simple lifecycle rule. In this tutorial, we’ll be deleting all files in the bucket that are older than 30 days. [Read more…]
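As a rough illustration of what such a lifecycle rule looks like, here is a minimal sketch that builds the rule in Python and, optionally, applies it with boto3. The full post may well use the AWS console instead; the function names, the rule ID and the bucket name below are hypothetical and exist only for this example. Note that S3 counts the expiration days from each object's creation date.

```python
import json


def build_expiration_lifecycle(days=30, rule_id="expire-old-objects"):
    """Return a lifecycle configuration that expires every object after `days` days."""
    return {
        "Rules": [
            {
                "ID": rule_id,                 # hypothetical rule name
                "Filter": {"Prefix": ""},      # empty prefix matches the whole bucket
                "Status": "Enabled",
                "Expiration": {"Days": days},  # counted from object creation
            }
        ]
    }


def apply_lifecycle(bucket_name, config):
    """Send the configuration to S3; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=config
    )


if __name__ == "__main__":
    # Print the rule for inspection; nothing is sent to AWS here.
    print(json.dumps(build_expiration_lifecycle(30), indent=2))
```

To apply the rule for real, one would call `apply_lifecycle("my-example-bucket", build_expiration_lifecycle(30))`, where the bucket name is a placeholder.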
The following are slides from Amazon AWS Innovate 2017 Toronto, held on the 10th of May 2017. [Read more…]
I would like to start my initial post by quoting Paul Maritz, CEO of VMware, who stated that “Cloud is about how you do computing, not where you do computing” (Davidson, M., 2015).
The following article explores the technological convergence of Intelligent Personal Assistants (IPAs). More specifically, it analyzes the emerging technology of Smart Voice-Enabled Wireless Speakers and how IPA technology transitioned from smartphones to built-in, dedicated home devices. The trend is changing the way in which machine learning and artificial intelligence are used, and the article explores the new and evolving technology in the context of its overall social impact. [Read more…]
The following article analyses how well anonymization methods and various other obfuscation techniques defend against privacy concerns. [Read more…]
Nowadays, the volume, velocity, veracity and variety of Big Data are the primary factors behind, and true amplifiers of, the security issues experienced in large-scale cloud infrastructures. The upsurge in security issues in Big Data installations is predominantly driven by an overall increase in the volume and velocity of the data. However, the sheer amount of data is no longer the single factor creating new security challenges: dealing with the diversity of data sources (the variety of data) is quickly becoming another pressing security concern, and one of the newest for Big Data infrastructures.
In the following post, I cover a brief history of the Enterprise Data Warehouse (EDW), analyze the major challenges of Enterprise Data Warehouse solutions and discuss traditional EDWs and their capacity to handle Volume, Variety, and Velocity (three of the V’s of Big Data). I also explore Big Data platforms as a potential alternative to the EDW. [Read more…]
The following post discusses the ‘percentage correct’ method of evaluating predictions and explains why it may not be the most precise way to measure performance. I also examine analytic measurement techniques in general and recommend a substitute method for situations in which ‘percentage correct’ is not a suitable performance measure. [Read more…]
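To make the weakness of ‘percentage correct’ concrete, here is a small sketch on made-up data. The excerpt does not say which substitute the post recommends, so the use of recall below is purely my assumption for illustration, as are the function names and the 5-in-100 class split.

```python
def accuracy(actual, predicted):
    """Share of predictions that match the actual labels ('percentage correct')."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


def recall(actual, predicted, positive=1):
    """Share of actual positives that the model managed to find."""
    true_pos = sum(
        1 for a, p in zip(actual, predicted) if a == positive and p == positive
    )
    actual_pos = sum(1 for a in actual if a == positive)
    return true_pos / actual_pos if actual_pos else 0.0


# Made-up, imbalanced data: 5 positive cases out of 100.
actual = [1] * 5 + [0] * 95
# A useless "model" that always predicts the majority class.
always_negative = [0] * 100

if __name__ == "__main__":
    print("accuracy:", accuracy(actual, always_negative))
    print("recall:  ", recall(actual, always_negative))
```

The always-negative model is right 95 times out of 100 yet finds none of the positive cases, which is the kind of situation where ‘percentage correct’ on its own can mislead.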
In this article, I continue exploring logging as a data set. I described this type of data set earlier in the Log Management and Big Data Analytics post. Here, I suggest applying a particular partitioning method called k-means clustering, because I think it is the most suitable candidate for one specific part of the log file management problem domain. I explain why I consider the k-means technique the most appropriate for this type of data. I also cover the advantages that this analysis brings to logging in general, and demonstrate the k-means cluster analysis method on a real data set. [Read more…]
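For readers unfamiliar with k-means, here is a minimal one-dimensional sketch of the idea in plain Python. It is not the analysis from the post: the response-time values, the starting centroids and the function name are all invented for illustration, and a real log analysis would typically use more features and a library implementation.

```python
from statistics import mean


def kmeans_1d(values, centroids, max_iter=100):
    """Basic k-means on 1-D numbers, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(v - centroids[i])
            )
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            mean(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # centroids stopped moving, so the clustering is stable
        centroids = updated
    return centroids, clusters


# Invented response times (ms) taken from hypothetical log lines.
response_times = [12, 15, 11, 14, 480, 510, 495, 13]

if __name__ == "__main__":
    centers, groups = kmeans_1d(response_times, centroids=[10, 500])
    print("centroids:", centers)
    print("clusters: ", groups)
```

With these values the fast and slow requests separate into two groups, which hints at how clustering can surface unusual entries in a large log without hand-written thresholds.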
In the following article, I explore the issue of log collection and analysis, a very specific problem domain for many large organizations. Logging is a good example of a high-volume, high-velocity data set, which makes it a good candidate for the application of Big Data analytic techniques. This article is not meant to go into the details of how analytic methods perform data classification or certain other analytic tasks; it is mainly meant to shed some light on the application of Big Data techniques to logging and to outline some of the business benefits of log analysis. [Read more…]
This is a short guide on how to install a single-node Hadoop cluster on a Windows computer without Cygwin. The intention behind this little test is to have a Hadoop test environment in your own local Windows environment. [Read more…]
The following article analyses the applicability of the CAP theorem to Big Data. I will explain the CAP theorem, explore its three characteristics, and provide a proof of the CAP theorem using an example that is closely related to a Big Data use case. I will also briefly discuss a couple of possible ways to deal with CAP-related issues in distributed Big Data applications and offer an overview of those implementations that best fit each of the CAP properties. [Read more…]
Microtargeting (also micro-targeting or micro-niche targeting) is one of the methods used by the marketing sector to analyze consumer data collected from various sources in order to detect the interests of specific individuals. The collected data is ordered and classified, and later provides the information used to influence the thoughts of specific like-minded groups of people. That said, one of the major aims of microtargeting initiatives is simply to identify the target audience at as granular a level as possible and also to identify the target’s preferred communication channel. [Read more…]
The following article is my attempt at exploring the niche market of Smart IoT Door Locking Solutions and partially investigating how Big Data analytics could improve this specific sector and thus also our personal life and home security. [Read more…]