Basic schemes for data integrity in the cloud are Provable Data Possession (PDP) and Proof of Retrievability (PoR). The following section describes the privacy techniques for data integrity.

 

3.1 Provable Data Possession (PDP)


Provable Data Possession (PDP) is a technique for assuring data integrity over remote servers. In PDP, a client that has stored data at an untrusted server can verify that the server possesses the original data without retrieving it. The working principle of PDP is:
Fig. 1: Principle of PDP [4]

 

The client generates a pair of matching public and secret keys using a probabilistic key-generation algorithm.

The client sends the public key along with the file to the server for storage and deletes the file from its local storage.

The client challenges the server for a proof of possession of a subset of the blocks in the file.

The client checks the response from the server.
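The steps above can be sketched as a toy challenge-response protocol. This is an illustrative simplification, not the actual PDP construction: real PDP uses homomorphic verifiable tags based on RSA so that proofs stay small, whereas here hypothetical per-block HMAC tags are stored alongside the data.

```python
import hashlib
import hmac
import secrets

BLOCK = 64  # toy block size, in bytes


def keygen():
    # Simplified stand-in for PDP's probabilistic key generation:
    # a single secret HMAC key instead of a public/secret key pair.
    return secrets.token_bytes(32)


def tag_blocks(key, data):
    # Client splits the file into blocks and tags each one before upload.
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    # Binding the block index into the tag prevents block reordering.
    tags = [hmac.new(key, i.to_bytes(4, "big") + b, hashlib.sha256).digest()
            for i, b in enumerate(blocks)]
    return blocks, tags


def server_prove(blocks, tags, challenge):
    # Server answers with the challenged blocks and their stored tags.
    return [(i, blocks[i], tags[i]) for i in challenge]


def client_verify(key, proof):
    # Client recomputes each tag and compares it to the returned one.
    return all(
        hmac.compare_digest(
            hmac.new(key, i.to_bytes(4, "big") + b, hashlib.sha256).digest(), t)
        for i, b, t in proof)


key = keygen()
blocks, tags = tag_blocks(key, b"some outsourced file content" * 10)
challenge = [0, 2, 3]  # a random subset of block indices in practice
assert client_verify(key, server_prove(blocks, tags, challenge))
```

Note that in this simplification the challenged blocks travel back to the client; the point of real PDP is precisely to avoid that by letting the server aggregate the proof.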

Challenges in PDP:

Lack of error-correcting codes to address corruption concerns.

Lack of privacy preservation.

No support for dynamic data.

 

3.2 Basic PDP Scheme based on MAC

 

The data owner computes Message Authentication Codes (MACs) of the whole file with a set of secret keys and stores them locally before outsourcing the file to the CSP. The owner keeps only the computed MACs in local storage, sends the file to the CSP, and deletes the local copy of the file F. Whenever a verifier needs to check the integrity of file F, he/she sends a request to the CSP, reveals a secret key to the cloud server, asks it to recompute the MAC of the whole file, and compares the recomputed MAC with the previously stored value.
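This scheme maps directly onto the standard `hmac` module. The function names below are illustrative, not from any specific implementation; the key point the sketch shows is that each secret key can be used for exactly one verification, since revealing it to the CSP consumes it.

```python
import hashlib
import hmac


def owner_setup(secret_keys, file_bytes):
    # Owner computes one MAC of the whole file per secret key and
    # keeps only these MACs locally; the file itself goes to the CSP.
    return [hmac.new(k, file_bytes, hashlib.sha256).digest()
            for k in secret_keys]


def csp_recompute(revealed_key, stored_file):
    # CSP recomputes the MAC of the whole stored file under the
    # key the owner has just revealed.
    return hmac.new(revealed_key, stored_file, hashlib.sha256).digest()


keys = [b"key-1", b"key-2"]          # each key supports one verification
file_f = b"outsourced file contents"
macs = owner_setup(keys, file_f)

# One verification consumes one secret key.
assert hmac.compare_digest(csp_recompute(keys[0], file_f), macs[0])
```

Once `keys` is exhausted, the owner must retrieve the whole file to compute fresh MACs, which is exactly the limitation listed below.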

Challenges in PDP based on MAC:

The number of verifications allowed is limited by the number of secret keys.

The data owner has to retrieve the entire file F from the server in order to compute new MACs, which is not feasible for large files.

Public auditability is not supported, as the secret keys are required for verification.

 

3.3 Scalable PDP

 

Scalable PDP uses symmetric-key cryptography, whereas the original PDP uses public-key cryptography, in order to reduce computation overhead. Scalable PDP supports dynamic operations on remote data. In Scalable PDP, all challenges and answers are pre-computed, and only a limited number of updates is possible. Scalable PDP does not require bulk encryption. Because it relies on symmetric keys, which are more efficient than public-key encryption, it does not offer public verifiability.
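The pre-computation idea can be sketched as follows. This is a deliberate simplification: the real scheme derives challenges with pseudorandom functions and stores the tokens encrypted at the server, while here the client simply keeps them locally. All names are illustrative.

```python
import hashlib
import random
import secrets

BLOCK = 64


def precompute_tokens(blocks, n_tokens, per_challenge=2):
    # The client fixes n_tokens challenges in advance; once they are
    # used up, no further verification is possible (a key limitation
    # of Scalable PDP).
    rng = random.Random()
    tokens = []
    for _ in range(n_tokens):
        idx = sorted(rng.sample(range(len(blocks)), per_challenge))
        nonce = secrets.token_bytes(16)
        expected = hashlib.sha256(
            nonce + b"".join(blocks[i] for i in idx)).digest()
        tokens.append((idx, nonce, expected))
    return tokens


def server_answer(blocks, idx, nonce):
    # The server only hashes the challenged blocks: cheap symmetric-style
    # work, no public-key operations.
    return hashlib.sha256(nonce + b"".join(blocks[i] for i in idx)).digest()


data = b"outsourced file" * 40
blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
tokens = precompute_tokens(blocks, n_tokens=3)

idx, nonce, expected = tokens.pop()   # each verification consumes a token
assert server_answer(blocks, idx, nonce) == expected
```

The sketch also makes the update problem visible: changing a block invalidates every remaining token that covers it, so tokens must be re-created.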

Challenges in Scalable PDP:

A client can perform only a limited number of updates and challenges.

It does not support block insertions; only append-type insertions are possible.

This scheme is problematic for large files, as each update requires re-creating all the remaining challenges.

 

3.4 Dynamic PDP

 

Dynamic PDP (DPDP) is a collection of seven polynomial-time algorithms (KeyGenDPDP, PrepareUpdateDPDP, PerformUpdateDPDP, VerifyUpdateDPDP, GenChallengeDPDP, ProveDPDP, VerifyDPDP). It supports full dynamic operations such as insert, update, modify, and delete. This technique uses rank-based authenticated directories along with a skip list for insertion and deletion. Although DPDP has some computational complexity, it is still efficient: for example, verifying the proof for a 500 MB file produces only 208 KB of proof data and 15 ms of computational overhead. Because it supports fully dynamic operations, it carries relatively higher computational, communication, and storage overhead. All challenges and answers are generated dynamically.

Challenges in Dynamic PDP:

It has some
computational complexity.

Not suitable for thin clients.

DPDP does not include
provisions for robustness.

 

3.5 Basic Proof of Retrievability (PoR)

 

The Proof of Retrievability (POR) mechanism tries to obtain and verify a proof that the data stored by a user in the cloud (called a cloud storage archive, or simply an archive) has not been modified by the archive, thereby assuring the integrity of the data. The simplest POR scheme can be built using a keyed hash function hk(F). In this scheme the verifier, before archiving the data file F in cloud storage, pre-computes the cryptographic hash of F using hk(F) and stores this hash as well as the secret key K. To check whether the integrity of the file F is intact, the verifier releases the secret key K to the cloud archive and asks it to compute and return the value of hk(F).

Challenges in basic POR:

It only works with static data sets.

It supports only a limited number of queries as challenges, since it deals with a finite number of check blocks.

POR does not provide any prevention mechanism for the file stored at the CSP.

 

3.5.1 Data placed on a single server at the cloud

This scheme provides proof of retrievability for large files using "sentinels". The archive needs to access only a small portion of the file F. Special blocks (called sentinels) are hidden among the other blocks in the data file F. In the setup phase, the verifier randomly embeds these sentinels among the data blocks. During the verification phase, to check the integrity of the data file F, the verifier challenges the prover (cloud archive) by specifying the positions of a collection of sentinels and asking the prover to return the associated sentinel values, as shown in Fig. 2.

Challenges in POR for large files:

This technique imposes computational overhead for large files, as encryption must be performed on the whole file.

This method puts storage overhead on the server, partly because of the newly inserted sentinels and partly because of the error-correcting codes that are inserted.

To check the integrity of the file, the user needs to download the whole file, which increases input/output and transmission costs across the network.

This method works only with static data.

 

3.5.2 POR based on keyed hash function hk(F)

 

A keyed hash function is very simple and easily implementable, and it provides a strong proof of integrity. In this method the user pre-computes the cryptographic hash of F using hk(F) before outsourcing the data file F to cloud storage, and stores the secret key K along with the computed hash. To check the integrity of the file F, the user releases the secret key K to the CSP and asks it to compute and return the value of hk(F). If the user wants to check the integrity of the file F multiple times, he has to store multiple hash values under different keys.
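A minimal sketch of this scheme, instantiating hk with BLAKE2b's keyed mode (an assumption; any keyed cryptographic hash or HMAC would serve):

```python
import hashlib
import hmac
import secrets


def hk(key, file_bytes):
    # Keyed hash h_k(F), here instantiated with BLAKE2b keyed hashing.
    return hashlib.blake2b(file_bytes, key=key).digest()


def user_setup(file_f, n_checks):
    # One (key, hash) pair must be stored per future integrity check,
    # which is why the number of checks is bounded up front.
    keys = [secrets.token_bytes(32) for _ in range(n_checks)]
    return [(k, hk(k, file_f)) for k in keys]


file_f = b"archived file F"
checks = user_setup(file_f, n_checks=3)

k, expected = checks.pop()            # reveal key k to the archive
assert hmac.compare_digest(hk(k, file_f), expected)  # archive recomputes hk(F)
```

The sketch makes the costs concrete: storage grows linearly with the number of planned checks, and every check hashes the entire file.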

Challenges:

The verifier needs to store a key for each check it wants to perform, as well as the hash value of the data file F under each hash key.

It requires higher resource costs for implementation, as hashing has to be performed on the entire file every time.

Computation of the hash value for large data files can be computationally burdensome for thin clients.

 

3.5.3 HAIL

 

HAIL, a high-availability and integrity layer for cloud storage, allows the user to store data on multiple servers, creating redundancy of the data. The simple principle of this method is to ensure the data integrity of a file via data redundancy. HAIL uses message authentication codes (MACs), pseudorandom functions, and universal hash functions in its integrity process. The proof generated by this method is independent of the size of the data and is compact.

Challenges:

Mobile adversaries, which may corrupt the file F, are the biggest threat to HAIL.

This technique is applicable only to static data.

It requires more computation power.

Not suitable for thin clients.

 

3.5.4 POR Based on Selecting Random Bits in Data Blocks

This technique involves encrypting a few bits of data per data block instead of encrypting the whole file F, thus reducing the computational burden on the clients. It rests on the fact that a high probability of security can be achieved by encrypting fewer bits instead of the whole data. Client-side storage and computational overhead are also minimized, as the client does not store any data, and bandwidth requirements are reduced. Hence this scheme suits thin clients well. In this technique the user needs to store only a single cryptographic key and two random sequence functions. The user does not store any data on its local machine. Before storing the file at the CSP, the user preprocesses the file, appends some metadata to it, and stores it at the CSP. At verification time, the verifier uses this metadata to verify the integrity of the data.
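A rough sketch of the idea. Two substitutions are assumptions made for the sketch: an HMAC over the selected bytes stands in for encrypting selected bits, and a PRG seeded from the secret key and block index models the scheme's two random sequence functions.

```python
import hashlib
import hmac
import random

BLOCK = 64
PICKS_PER_BLOCK = 8   # check only a few positions per block


def select_positions(key, block_index, block_len):
    # Deterministic position selection: the verifier can re-derive the
    # same positions later from the key alone, storing no data locally.
    seed = hashlib.sha256(key + block_index.to_bytes(4, "big")).digest()
    rng = random.Random(seed)
    return [rng.randrange(block_len) for _ in range(PICKS_PER_BLOCK)]


def make_metadata(key, data):
    # Per-block metadata over a few selected bytes, appended to the
    # file at the CSP; far cheaper than processing every byte.
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    meta = []
    for i, b in enumerate(blocks):
        sel = bytes(b[p] for p in select_positions(key, i, len(b)))
        meta.append(hmac.new(key, sel, hashlib.sha256).digest())
    return meta


def verify(key, data, meta):
    # Verifier recomputes the metadata from the retrieved blocks.
    return meta == make_metadata(key, data)


key = b"\x01" * 32
data = b"file contents" * 30
meta = make_metadata(key, data)
assert verify(key, data, meta)
```

Note the probabilistic guarantee: a corruption is detected only if it touches at least one selected position, so small, targeted changes can escape a single check.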

Challenges:

This technique is only used for static data.

No data prevention mechanism is implemented in this technique.

 


Table 1: Comparative study of data integrity techniques

Methodology                   | Single Server | Multi Server | Require a TPA (Third Party Auditor) | POR (Proof of Retrievability) | Encrypted     | Thin Users | Entire data
------------------------------+---------------+--------------+-------------------------------------+-------------------------------+---------------+------------+------------
Simplest POR                  | Yes           | No           | No                                  | Yes                           | Yes           | No         | Yes
POR using sentinels           | Yes           | No           | No                                  | Yes                           | Yes           | No         | Yes
PDP                           | Yes           | No           | No                                  | No                            | NS            | No         | NS
SDP                           | Yes           | No           | No                                  | Yes                           | NS            | No         | NS
Kumar & Saxina proposed model | Yes           | No           | No                                  | Yes                           | Yes (partial) | Yes        | Yes
Shacham                       | Yes           | No           | No                                  | Yes                           | NS            | Yes        | NS
Kennadi's HAIL protocol       | Yes           | Yes          | No (Optional)                       | Yes                           | Yes           | No         | Yes
MR-PDP                        | Yes           | Yes          | No (Optional)                       | Yes                           | NS            | No         | NS
Shah                          | Yes           | Yes          | Yes                                 | Yes                           | Yes           | Maybe      | Yes
Wang                          | Yes           | Yes          | No (Optional)                       | Yes                           | No            | No         | No
Sobol Sequence                | Yes           | Yes          | No (Optional)                       | Yes                           | No            | No         | Yes
4. DATA INTEGRITY CHALLENGES

A comparative study of all the data integrity techniques is as follows: