Abstract— An estimated six billion devices are presently connected to the
Internet, and by 2020 the number is expected to reach 25 billion. During this
growth, security has been identified as one of the weakest areas in the
Internet of Things (IoT). To meet the different challenges in securing IoT, we
propose using machine learning within an IoT gateway to help secure the system,
using regression in the gateway to detect anomalies in the data sent from edge
devices.

Internet of Things, machine learning, regression, security

I.      Introduction

Internet of
Things (IoT) is presently a hot technology worldwide. Government, academia, and
industry are involved in different aspects of research, implementation, and
business with IoT. IoT cuts across different application domain verticals
ranging from civilian to defense sectors. These domains include agriculture,
space, healthcare, manufacturing, construction, water, and mining, which are
presently transitioning their legacy infrastructure to support IoT. Today it is
possible to envision pervasive connectivity, storage, and computation, which,
in turn, gives rise to building different IoT solutions. Applications such as
innovative shopping systems, infrastructure management in both urban and
rural areas, remote health monitoring and emergency notification systems, and
transportation systems are gradually coming to rely on IoT-based systems.

Over the past decade, the popularity of Python as a
mainstream programming language has exploded. Notable advantages of Python over
other languages include, but are not limited to, the following. It is a very
simple language to learn and easy to implement and deploy, so you don't need to
spend a lot of time learning formatting standards and compiling options.

It is portable, extensible, and embeddable, so it is
not system dependent and hence supports many of the single-board computers on
the market these days, irrespective of architecture and operating system. Most
importantly, it has a huge community which provides a lot of support and
libraries for the language.

We are living in a world
surrounded by billions of computing systems, identifying, tracking, and analyzing
some of our intimate personal information, including health, sleep, location,
and network of friends. The trend is toward even higher proliferation of such
devices, with an estimated 50 billion smart, connected devices by 2020,
according to a recent report by Cisco. These devices generate, process, and
exchange a large amount of sensitive information and data (often collectively
referred to as “security assets” or simply “assets”). In addition to private
end-user information, assets include security-critical parameters introduced
during the system architecture definition, e.g., fuses, cryptographic and
digital rights management (DRM) keys, firmware execution flows, and on-chip
debug modes. Malicious access to these assets can result in leakage of company
trade secrets for device manufacturers or content providers, identity theft or
privacy breach for end users, and even destruction of human life. Security
assurance of a modern computing device involves a number of challenges. One key
challenge is the sheer complexity of the design. Most modern computing systems
are architected via a system-on-chip (SoC) paradigm, viz., through a
composition of predesigned hardware or software blocks referred to as
intellectual properties (IPs) that interact through a network of on-chip
communication fabrics. The IPs themselves are highly complex artifacts
optimized for performance, power, and silicon overhead. Adding to the
complexity are the communication protocols used in implementing complex
system-level use cases. Finally, security assets are sprinkled at different IPs
across the design, and access to the assets is governed by complex security
policies. The policies are defined by system architects as well as different IP
and SoC integration teams, and undergo refinement and modification throughout
the system development. This makes it challenging to validate a system, develop
architectures to provide built-in resilience against unauthorized access, or
update security requirements, e.g., in response to changing customer needs.
Another source of challenge is the supply chain involved in the development of
a modern computing device. A large number of players are involved,
including IP providers, SoC design houses, and foundries. With the increasing
globalization of the semiconductor design and fabrication process, each of
these players often involves a large number of organizations, often spread
across geographies, coordinating to create a complex supply-chain pipeline.
Every component of the pipeline is vulnerable to malicious design alterations,
subversions, piracy, and other security threats. Even in cases where a
component is designed without intended malice, aggressive
time-to-market requirements and high optimization needs often result in errors
and vulnerabilities inadvertently left in the design, which can be exploited by
a malicious adversary in the field.

Given the broad spectrum of vulnerabilities and
corresponding mitigation strategies, the subject of SoC security today is
highly fragmented.

II.    Related Work

Gaps within
security techniques to protect sensor nodes, to maintain trust between devices,
and to defend against Man in the Middle attacks, Denial of Service (DoS)
attacks, etc. They concluded that there is currently extensive work occurring
within IoT authentication and access control protocols but
other work needs to be done as well. Maintaining the integrity of the IoT requires
intelligent processing and reliable transmission within the network. To provide
this, the network architecture contains three layers: the application layer,
the transport layer, and the sensing layer. The application layer contains the
logical link between the user and the Internet through intelligent
applications. Intelligent applications include smart home furnishings and
intelligent architectures. The application layer uses machine learning, data
mining, data processing, and other analytics to process information from the
system and provide an output. The transport layer consists of network
communications including Wi-Fi, Bluetooth, ZigBee, and 802.15.4. The transport
layer contains the gateway or gateways that process the information and relay
the information across the network. The sensing layer contains edge devices
that are composed of a variety of sensors and actuators that collect data and
send it through the transport layer to the application layer for analysis.
There are many security threats present in the transport layer. Our approach
is to add machine learning within the transport layer to help determine if
there are interruptions in the data transfer and to monitor the edge devices
from the sensing layer. This approach also addresses the security of the
entire system, not simply the authentication and access control protocols.


In examining the
approach, we will begin with an overview of our test bed creation and then
discuss our machine learning methodology.


A. Testbed Creation


First, I
installed Spyder 3.6 and imported the required libraries. After that, I
imported the dataset by using


dataset = pd.read_csv("Data.csv")


After that, I
declared two variables, X and Y: using iloc, X selects all columns of the
dataset except the last one, and Y selects only the last column. Any missing
values in the dataset are handled with the class imported by


from sklearn.preprocessing import Imputer


which computes the mean of each column and uses it to replace the missing
entries in the dataset. The dataset contains three country names, France,
Germany, and Spain, as values in X. A machine learning model works only with
numbers and mathematical equations, so it cannot use these names directly. To
solve this problem I used dummy variables, implemented in code as the One Hot
Encoder. After that, I split the dataset into a training set and a test set
with a test ratio of 0.2, i.e., 80% training data and 20% test data, by using


from sklearn.cross_validation import train_test_split
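
The preprocessing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the experiments: the inline CSV text and the column names Age, Salary, and Purchased are hypothetical stand-ins for Data.csv (only the country names come from the text). It also uses the import paths of recent scikit-learn releases, sklearn.impute.SimpleImputer and sklearn.model_selection.train_test_split, in place of the older ones quoted above, and pandas get_dummies as a simple substitute for the One Hot Encoder.

```python
import io

import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for Data.csv: one categorical column, two numeric
# feature columns (with gaps), and the target in the last column.
CSV_TEXT = """Country,Age,Salary,Purchased
France,44,72000,1
Spain,27,48000,0
Germany,30,54000,0
Spain,38,61000,0
Germany,40,,1
France,35,58000,1
Spain,,52000,0
France,48,79000,1
Germany,50,83000,0
France,37,67000,1
"""

dataset = pd.read_csv(io.StringIO(CSV_TEXT))

# X: every column except the last; y: only the last column.
X = dataset.iloc[:, :-1]
y = dataset.iloc[:, -1]

# Replace missing numeric values with the mean of their column.
numeric_cols = ["Age", "Salary"]
imputer = SimpleImputer(strategy="mean")
X_numeric = pd.DataFrame(
    imputer.fit_transform(X[numeric_cols]), columns=numeric_cols, index=X.index
)

# Turn the country names into 0/1 dummy columns (one-hot encoding).
X_country = pd.get_dummies(X["Country"], prefix="Country", dtype=float)
X_encoded = pd.concat([X_country, X_numeric], axis=1)

# Hold out 20% of the rows as the test set; the rest is the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y, test_size=0.2, random_state=0
)
```

Fixing random_state makes the split reproducible, which helps when comparing runs; any integer would do.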


When the values
in the dataset lie in different ranges, I used feature scaling to
solve the problem for the machine learning model. Finally, using a simple
linear regression technique, I predicted the test-set values with the model
fitted on the training set.
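
A hedged sketch of the scaling and regression step is given below. It assumes scikit-learn's StandardScaler and LinearRegression, which the text does not name explicitly, and it runs on synthetic data generated in the snippet itself (50 rows with a single feature), not on the dataset used in the experiments.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic, hypothetical data: one feature on a large scale and a target
# that depends on it linearly, plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(1_000, 50_000, size=(50, 1))
y = 3.0 + 0.002 * X[:, 0] + rng.normal(0.0, 0.5, size=50)

# 80% training data, 20% test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training set only, then apply it to both sets,
# so no information from the test set leaks into the scaling.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit simple linear regression on the training set and predict the test set.
regressor = LinearRegression()
regressor.fit(X_train_scaled, y_train)
y_pred = regressor.predict(X_test_scaled)

# Average absolute gap between predicted and actual test values.
mean_abs_error = float(np.mean(np.abs(y_pred - y_test)))
```

Comparing y_pred with y_test, for example through mean_abs_error, is one simple way to judge how well the fitted line generalizes to data it has not seen.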


B. Machine Learning Methodology


 Machine learning is the use of algorithms
within a program to learn from collected data. Within machine learning there
are various algorithms that exist to learn from data. We chose to implement a
Simple Linear Regression technique to monitor the system. Simple Linear
Regression (SLR) is a machine learning technique that models the relationship
between one input variable and an outcome in order to predict that outcome. To
create an SLR model, we chose to use Python within Spyder 3.6, a scientific
development environment for Python. Packages are readily available
in Python for machine learning, statistics, graphing, probability, etc. We
chose to use the Pandas package, which allows us to load and analyze the data
used for predictions.
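
To illustrate how an SLR model might monitor data at a gateway, the sketch below fits a line to baseline readings and flags new readings that fall far from it. The text does not specify the detection rule, so the residual threshold of k standard deviations is an assumption, as are the function names fit_line and flag_anomalies and the sample readings.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares fit of y ~ intercept + slope * x; returns (slope, intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return slope, intercept


def flag_anomalies(x, y, slope, intercept, sigma, k=3.0):
    """Return a boolean array: True where |reading - prediction| > k * sigma."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = y - (intercept + slope * x)
    return np.abs(residuals) > k * sigma


# Hypothetical baseline: 20 readings from one edge device that drift
# upward slowly, with a small alternating measurement error.
t_base = np.arange(20)
y_base = 20.0 + 0.1 * t_base + 0.05 * (-1.0) ** t_base

slope, intercept = fit_line(t_base, y_base)
# The spread of the baseline residuals sets the tolerance band.
sigma = np.std(y_base - (intercept + slope * t_base))

# Three new readings arriving at the gateway; the middle one is far off-trend.
t_new = np.array([20, 21, 22])
y_new = np.array([22.03, 35.0, 22.15])
flags = flag_anomalies(t_new, y_new, slope, intercept, sigma)
```

Here only the middle reading is flagged, because it lies far outside the band around the fitted line. The value of k trades missed anomalies against false alarms and would need tuning for a real deployment.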

IV.   Experiments & Results

My dataset contains 50 columns
about an organization, such as salaries and experience. Of this data, I
assigned 20% to the test set and the remaining 80% to the training set.
Finally, I predicted the data as follows.

Fig. 1.   Dataset split into training set and test set

Test set with training set

Predicted test set with training set



