Data Is Considered One of the Most Valuable Assets to Organizations in Today’s World – Course Reflection

Last Updated: 11 Feb 2023
Pages: 6 Views: 97
Table of contents

Introduction

Most organizations use data to make effective business decisions which gives them a competitive advantage over their rivals. The advancement of technology has increased the amount of data being generated daily. It is estimated that 2.5 quintillion byte of data is produced each day but will increase due to the growth of Internet of Things (IoT). Considering the amount of data being produced daily, organizations would have to build big data architecture to handle the amount of data generated and hire new employees or train existing employees to be able to extract these data to provide meaningful insights for decision making.

Big data is a term used to describe data that is large in volume, with respect to the processing system, with a variety of structured and unstructured data containing different data patterns to be analyzed (Akhtar, 2018). The characteristics of big data make it difficult or impossible to be processed using traditional methods. As a result, there is the need to design and build a big data architecture using different tools and systems to be able to accommodate the different data types and structures associated with big data. The importance of big data revolves around using your data to make smart business decisions. Being able to use data to determine the underlying cause of failures, defects, and issues and gathering patterns and trends to make a decision is very vital to every organization. Big data solutions could either be hosted in the cloud using a cloud service provider or be hosted as an on-premise infrastructure. Different underlying factors could help organizations to determine whether a cloud solution would be the best fit for their needs or an on-premise infrastructure solution. Some of these factors include;

Order custom essay Data Is Considered One of the Most Valuable Assets to Organizations in Today’s World – Course Reflection with free plagiarism report

feat icon 450+ experts on 30 subjects feat icon Starting from 3 hours delivery
Get Essay Help
  • Cost – This is a very important factor not just for small organizations but also for big companies. It is one of the most important deciding factors in finalizing any solutions.
  • Security – It is one of the main concerns for most organizations. On-premise infrastructure set up gives organizations a sense of security, knowing that they are in control over who is accessing their data when it is accessed and for what purpose.
  • Scalability – Every big data infrastructure whether on-premise or cloud should be scalable to meet growing demands.
  • Current Capabilities – Another important factor to consider in terms of an on-premise set up versus cloud solution is whether the organization has the personnel to manage the implementation on site.

In both cases, there are always pros and cons to be considered when building or implementing a big data infrastructure for an organization.

Video Analysis

The videos demonstrated the underlying concepts of big data architecture and design. It is always important to understand when people are talking about big data and which technologies will be appropriate for implementation. The structure of the data collected and the purpose for which the data is being gathered is also important. Because of the different formats and structures in which data is obtained, semi-structured and unstructured data types typically do not fit well with the traditional data warehouses. A typical traditional database system uses batch jobs, scheduling them regularly and migrating them into a data warehouse. The data is usually structured and optimized for reporting and analytics purposes only. Because of the downside of a traditional data warehouse, organizations are collecting and analyzing big data using NoSQL databases such as Apache Cassandra, CouchDB, and MongoDB. The presenter mentioned the importance of using Hadoop in big data architecture. Hadoop is an open-source software platform that processes very large datasets in a distributed environment with respect to storage and computational power, and it is mainly built on low-cost commodity hardware (Akhtar, 2018). Apache Hadoop provides a framework for the file layer, cluster management layer, and the processing layer for the architecture. Hadoop has four main components which include;

  1. Hadoop Distributed File System (the file system layer)
  2. Hadoop MapReduce (consist of cluster management and processing layer)
  3. Hadoop YARN (cluster management layer)
  4. Hadoop Common

The most important thing about these modules is that when used together they provide the framework necessary to leverage the Hadoop ecosystem within our big data architecture.

The final video introduced the concepts of cloud computing and serverless web applications. Most organizations are moving their on-premise infrastructure to be hosted by cloud providers, either completely or partially. With the evolution of cloud services, both small and large organizations do not have to bear the initial set up cost in terms of building huge IT infrastructure. They can invest the set-up cost into their business operations to grow the business. Choosing a cloud solution would allow organizations to only pay for the resources they use. Cloud computing provides a lot of advantages. When using cloud services, you will not have to bear any maintenances cost for the cloud infrastructure. You could also have your infrastructure placed in different geographical locations so that the computation and the storage will be closer to the place where it must be utilized. The disadvantage of cloud computing services is that you will have less control over the privacy of your data. Because the data is hosted on the cloud, you may not be able to determine who could have access to your data. Considering both the pros and cons of cloud service and on-premise infrastructure, some organizations sometimes opts-in for hybrid solutions. This type of set-up includes keeping some of the most valuable data within their on-premise infrastructure and some out in the cloud. Using cloud services allows developers within your organization to focus on core product development instead of worrying about managing and operating on-premise infrastructure.

Course Reflection

At the beginning of this course, I knew very little about big data architecture and design. The course has expanded my knowledge and I’m very confident to use the tools introduced in this course to build both personal and organizational projects. Considering the learning objectives introduced at the beginning of this course, I would say that my greatest strength and biggest areas of improvement is being able to utilize the Apache Ecosystem (which include tools like Hadoop, HBase, Hive, Spark, Cassandra, Kafka, etc.) to build big data architecture. Also, using cloud services like Microsoft Azure in my final project gave me a better understanding of leveraging web services when building applications and database systems.

The most difficult lab project was week 4’s lab project where we were tasked to build a frontend and backend architecture for a big data solution. Designing a frontend and backend architecture for any software program or application requires an understanding of what you would like the app to achieve and the tool necessary to accomplish those tasks. Setting up my frontend architecture using the ReactJS was quite challenging since the instructions provided in the text wasn’t as straight forward as expected. Locating the files to modify the React App created an issue and couldn’t successfully execute the modification to reflect the changes. Despite the challenges, what I enjoyed about this lab project was running the RESTAPI using the java programming language. It was a great experience having to use JavaScript in this project. The knowledge gained from the project would help me going forward whenever I would have to use java to execute a project. In all, the lab project was a great experience and but a bit challenging. It would have been better if there were straight forward instructions from the text as to how to execute some of the difficult commands.

For my final project, I build an Azure cloud database system for MySQL Server. The purpose of this build was to be able to perform real-time analytics. The build began by creating a database system in the cloud using the Azure portal and then setting up the server configuration to be used by MySQL. This allowed MySQL to interact with the Azure cloud database system. The final part of the project was to provide real-time analytics using the data available in the cloud. I established a data connection between MySQL and Tableau to perform the analysis. In my present career, I make use of SQL daily, utilizing the language to extract, manipulate and transform data into BI systems for analytic purposes. Since I’m considering a career in data science after my graduate program, the knowledge gained from this project would help me to leverage different web services and applications with Machine Learning and Artificial Intelligence on personal and organizational projects.

Being able to use off-the-shelf commercial tools such as Azure to build streams, databases, virtual machines, etc., as well as leveraging open-source programs such as Hadoop, Cassandra, Hive, etc., has been very helpful since I had little to no knowledge about these tools when building big data architecture.

Conclusion

The concepts of big data architecture and design are very important as the amount of data grows each day. Utilizing tools from the Apache Ecosystem and programming languages such as JavaScript and Python allows organizations to run applications and programs that could aid in their decision-making process. The era of cloud computing has come to stay as small, medium-size and big organization migrates their on-premise infrastructure to the cloud. Also, because most of the data generated are usually semi-structured or unstructured, building a big data architecture to handle these types of data structures is very important.

Overall, the knowledge gained from this course utilizing the tools necessary to build big data platforms would help me going forward in my professional career. I would be able to use these tools and web services to build on projects that would allow organizations to make the best and most effective decisions

Cite this Page

Data Is Considered One of the Most Valuable Assets to Organizations in Today’s World – Course Reflection. (2023, Feb 11). Retrieved from https://phdessay.com/data-is-considered-one-of-the-most-valuable-assets-to-organizations-in-todays-world-course-reflection/

Don't let plagiarism ruin your grade

Run a free check or have your essay done for you

plagiarism ruin image

We use cookies to give you the best experience possible. By continuing we’ll assume you’re on board with our cookie policy

Save time and let our verified experts help you.

Hire writer