Biographical Dictionary Generator for Password Cracking

Category: Engineering & Sciences Paper Type: Report Writing Reference: APA Words: 9000

Abstract of Biographical Dictionary Generator for Password Cracking

This project proposes a biographical dictionary generator for password cracking. The idea is to improve existing cracking methods by building the dictionary from information about the target user. Hacking has both a negative and a positive side: many attackers try to crack passwords for their own purposes, while the same techniques can be used to test and improve security. The literature review chapter covers studies on recent challenges in password cracking and on the algorithms that make cracking easier. A cracking program is proposed and developed using an agile methodology. After development, the program is tested to check the accuracy of the algorithm. The program still requires some amendments to work efficiently in the future.

Acknowledgment of Biographical Dictionary Generator for Password Cracking

I am very thankful to my ****** and ****** for their help and their endless support. Moreover, their supervision provided me a lot of help for my project. I would also like to thank Birmingham City University for providing me with the resources I needed to complete this project.

Table of Contents

List of Abbreviations

Chapter 1: Introduction

Introduction

Biographical dictionary generator for password cracking

Aims

Objective

Rationale

Timetable for project

Gantt chart

Problem definition

Scope

Chapter 2: Literature Review

Review of Existing Knowledge

Chapter 3: Methodology

Agile Methodology

Iterative methodology

Waterfall methodology

Chapter 4: Development

Development

Unified Modeling language

Testing / Result

Chapter 5: Discussion

Discussion

Critical analysis of product

Critical analysis of my process

Chapter 6: Conclusion

Recommendation and future work

Bibliography

List of Figures

Figure 1: Dates for the project

Figure 2: Schedule dates in a Gantt chart for the project

Figure 3: The Iterative Model

Figure 4: Comparison of PCFG and MPCFG

List of Tables

Table 1: Personal information type in 12306 dataset

Table 2: Percentage of personal information use in password in 12306 dataset

List of Abbreviations

Abbreviation	Meaning
2FA	Two-Factor Authentication
RTT	Reverse Turing Test
EKE	Encrypted Key Exchange
PCFG	Probabilistic Context-Free Grammar
MPCFG	Modified Context-Free Grammar

Chapter 1: Introduction of Biographical Dictionary Generator for Password Cracking

Introduction of Biographical Dictionary Generator for Password Cracking

The biographical dictionary generator is used for password cracking. This report describes how a biographical password dictionary is generated and how the generator can be modified to improve it (Lancrenon, 2013). The improvement is based on the victim’s data. The study shows that data about the victim can be very effective in cracking passwords, because it increases the speed of the attack. The main aim of the research is to generate a dictionary for cracking a password. Its objective is to produce a method that improves password cracking by matching candidate passwords against the user’s data.

The password cracking program is expected to work effectively when users change their account passwords (Kody, 2018). This study draws on published reports on hacking that contain statistics about user behavior. A survey by Google and Harris Poll found that many people with online accounts reuse their passwords across multiple accounts. About one-third of web and mobile account holders use a different password for every account, while a smaller group reuses the same password for all accounts (Saliba, 2018). The scope of this research is to gain a better understanding of passwords through cracking techniques.

This can be done by using biographical information. Securing passwords is the need of the hour, as the demand for social media platforms on the internet grows day by day. Password relationships are divided into three categories: similarity-based, probability-based and modification-based (Wang, 2016). The literature review discusses related studies on hacking and on securing passwords (Vigliarolo, 2018). For this study, agile and iterative methodologies are proposed for creating the program that generates a biographical dictionary. This dictionary will be used for cracking passwords (Sterling, 2013).

The program focuses on matching and retrieving information. The user name, email address, phone number, account ID and password are the main fields used for cracking passwords (Bhattacharjee, 2013). Several diagrams are provided to visualize how the program works (Rivest, 2013). The testing and results sections check the accuracy of the proposed technique. The recommendation section gives directions for future study, and the conclusion summarizes the whole research.

Biographical dictionary generator for password cracking

This project explores the generation of biographical dictionaries, which can be improved using data about the victim. Such dictionaries are also useful to a system during password cracking (Kävrestad, 2018), because they reduce the time and increase the speed of cracking. Passwords are widely used across applications, data and devices to authenticate a user’s identity. They often start from biographical information about the user, and some systems additionally ask users to include different characters and symbols. There are security settings that can stop dictionary attacks (Kävrestad, 2017), but these usually have to be enabled by the user, for example two-factor authentication (2FA).

This project will look at creating custom password wordlists using personalized information related to the victim. For this research, the data is collected by the individual. Each wordlist will be customized to the victim and will include characters specific to the victim, as well as a specified character length. The custom wordlist therefore includes the complete set of characteristics used for generating the password wordlist (Yazdi, 2011).

Aims of Biographical Dictionary Generator for Password Cracking

This project aims to develop a biographical password generating program with the intent of improving how password cracking dictionaries are created.

Objective of Biographical Dictionary Generator for Password Cracking

The objectives of this project are:

· Research different password cracking techniques.

· Design and develop a password cracking program and test the authenticity.

· Generate biographical information for a password-protected file.

· Explore if biographical information is enough to decrease the time taken to conduct an attack.

· Implement new techniques for generating a biographical dictionary.

· Explain the key principles of password cracking in detail.

Rationale of Biographical Dictionary Generator for Password Cracking

The main benefit of this research study is to increase awareness of password cracking among organizations, researchers in the industry and the wider audience. It will provide information about the consequences of weak passwords, particularly those built from biographical information. A weak password is typically based on secondary details, dictionary words or personal information of the user, such as a favorite football team, a date of birth, a personal name, a nickname or a favorite celebrity. These passwords can be guessed easily by people who know enough about the user (Notoatmodjo et al., 2009). This paper will explore whether biographical information is enough to crack a user’s password. It will also implement new techniques that may improve dictionary attacks. This could be helpful if a user wants to test the strength of their password. Additionally, it gives users a stronger understanding of how to secure their data with a stronger password. Using the same password across multiple accounts and programs can be extremely dangerous, because one breach can compromise the other accounts. This project will also help people recover a forgotten or lost password that may contain biographical data.

Timetable for project of Biographical Dictionary Generator for Password Cracking

Figure 1: Dates for the project

Figure 2: Schedule dates in a Gantt chart for the project

Problem definition of Biographical Dictionary Generator for Password Cracking

This project aims to develop a biographical password generating program with the intent of improving how password cracking dictionaries are created. Hacking has become increasingly mainstream, which may be seen in a positive or negative light. According to a study conducted by Google and Harris Poll, around 52% of individuals reuse their passwords for “multiple (but not all) accounts”, 35% “use a different password for all accounts” and 13% “reuse the same password for all their accounts” (Services.Google.Com, 2019).

Reusing a password may appear to be an efficient way to remember the passwords for important accounts, but it leaves the user vulnerable to a data breach. Nobody understands this better than Facebook CEO Mark Zuckerberg, who fell victim to hacking and had his social media accounts compromised, including Twitter, where hackers tweeted from his account (Fox, 2019). Hackers obtained the account passwords of well-known CEOs through the security breach of the LinkedIn server. Mark Zuckerberg’s password for his LinkedIn account, “dadada”, was also used for his Twitter and other compromised social media accounts. These kinds of attacks can have immense consequences for a business (Bhole, 2017). Dropbox suffered an attack in 2012 that originated from an employee using the same password for LinkedIn as for their corporate Dropbox account. Rather than a few indiscreet tweets from a hacker, this attack resulted in the theft of approximately 60 million user details.

The benefit of this research study will be to increase awareness of password cracking among organizations, researchers in the industry and the wider audience. It will provide better insight into what makes a password weak when it is built from biographical information. This will be outlined in the guidelines. An example of a weak password is one based on a secondary detail, such as a favorite football team or the date of birth of a loved one. The paper will explore whether biographical information is enough to crack a user’s password and will also implement new techniques that may improve dictionary attacks. This could be helpful if a user wants to test the strength of their password. Additionally, it gives users a stronger understanding of how to secure their data with a stronger password. Using the same password across various accounts and programs is a danger that can compromise the security of that password. This project can also help people recover a forgotten or lost password that may contain biographical data.

The research study will help educate readers about the techniques used for cracking passwords, with the help of the biographical password cracking program that will be designed. This study does not aim to promote hacking. The project will help readers understand the importance of setting strong passwords and of not basing passwords on biographical information, which is easy to crack. This will be demonstrated with the help of the program.

Scope of Biographical Dictionary Generator for Password Cracking

The scope of this research is to understand the different password cracking techniques that use biographical information. With the growing demand for the internet and social media platforms, securing passwords has become the need of the hour. Passwords are one of the most common targets of cyber-attacks, because a successfully cracked password gives unauthorized access to sensitive information. User analysis is one method used to crack passwords. In this method, the user is analyzed to identify commonly used phrases, so that hints about the passwords can be generated. Hackers can reduce the time taken to crack a password by tracing and understanding the conversation style and characteristics of the user (Wong, 2013). A popular example is that passwords in the corporate world are mostly aligned with business activities, which makes them easy to crack. Establishing password relationships is another technique used to crack users’ passwords based on biographical information (Zheng et al., 2018).

Password relationships are further divided into three categories: modification-based, similarity-based and probability-based.

The modification-based technique observes the changes users typically make when they change their passwords, and cracks passwords on that basis. The similarity-based technique relies on users changing their passwords to similar strings. Finally, the probability-based technique assigns a weight to each candidate and uses these probabilities to form a chain that determines which password to try next.

Chapter 2: Literature Review of Biographical Dictionary Generator for Password Cracking

Review of Existing Knowledge of Biographical Dictionary Generator for Password Cracking

Hackers now make many attacks to crack passwords. As mentioned above, the reason is that passwords are highly vulnerable, and access to them can expose all important information.

Brute Force Attacks of Biographical Dictionary Generator for Password Cracking

It is one of the most well-known methods for cracking passwords of up to eight characters. It is fundamentally a trial-and-error strategy: the hacker systematically tries every possible combination of characters, computes the hash of each candidate string and compares it with the target hash until the password is cracked. The success of this attack depends on the length of the password. The hacker tries every combination of letters, numbers and punctuation to produce the password. If the password is longer than eight characters, this strategy takes more time, from minutes to a very long time, depending on the system used and the length of the password (Raza et al., 2012).
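The dependence on password length can be made concrete by counting candidates. The following is a minimal sketch, not part of the original study; the alphabet size of 95 (printable ASCII characters) and the rate of one billion guesses per second are illustrative assumptions.

```python
def keyspace(alphabet_size: int, max_length: int) -> int:
    """Count every string of length 1..max_length over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


def worst_case_seconds(alphabet_size: int, max_length: int,
                       guesses_per_second: float) -> float:
    """Time needed to try the whole keyspace at a fixed guessing rate."""
    return keyspace(alphabet_size, max_length) / guesses_per_second


if __name__ == "__main__":
    # Assumed values: 95 printable ASCII characters, 1e9 guesses per second.
    for length in (6, 8, 10):
        seconds = worst_case_seconds(95, length, 1e9)
        print(f"max length {length}: {keyspace(95, length)} candidates, "
              f"about {seconds / 86400:.1f} days")
```

Each extra character multiplies the work by roughly the alphabet size, which is why exhaustive search stops being practical beyond short passwords.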

Dictionary Attacks of Biographical Dictionary Generator for Password Cracking

A dictionary attack is similar to a brute force attack, but there is one significant difference between the two. Here, the hacker uses a list of probable matches (based on words of the English language, for example) rather than trying every possible character combination one by one. The list used in this attack often includes known passwords, words from the English language, sentences from books and more (Sharif & Khan, 2007).

Combined Dictionary Attacks of Biographical Dictionary Generator for Password Cracking

Taking the dictionary attack one step further and adding more complexity, hackers can combine a list of existing words with numbers in the same way that people do when creating new passwords, for example by swapping the letter 'e' with '3'. This technique is known as a "combined dictionary" attack, where the database used can contain words from one or more dictionaries (Bošnjak et al., 2018).
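The character swap described above can be sketched as follows, for example to check how many trivially related variants a chosen password has. This is an illustration only: the substitution table is a hypothetical three-entry example, and real tools use much larger rule sets.

```python
from itertools import product

# Hypothetical substitution table; only 'e' -> '3' comes from the text.
SUBSTITUTIONS = {"a": "4", "e": "3", "o": "0"}


def substitution_variants(word: str, table: dict = SUBSTITUTIONS) -> list:
    """Return every variant of word with each listed letter kept or swapped."""
    choices = [(ch, table[ch]) if ch in table else (ch,) for ch in word]
    return sorted({"".join(combo) for combo in product(*choices)})


if __name__ == "__main__":
    print(substitution_variants("tea"))
```

The number of variants doubles with every substitutable letter, so the candidate list stays far smaller than a brute force keyspace while still covering common human habits.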

Hybrid Dictionary and Rule-Based Dictionary Attacks of Biographical Dictionary Generator for Password Cracking

The hybrid dictionary attack takes the words listed in a dictionary and combines them with brute force, for example by prepending three digits to every entry. This gives results such as 111apple up to 999apple. This way of cracking a password can take some time, so adding a few rules to the guessing can shorten the time it takes to crack the password. This technique, however, leaves a lot of room for the hacker’s intelligence in defining the rules that the password cracking software will apply (Madiraju, 2014).
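The prefixing step can be sketched in a few lines. The function name and the default range are assumptions chosen to reproduce the 111apple to 999apple example from the text.

```python
def prefixed_candidates(words, start: int = 111, stop: int = 999):
    """Yield each word with every three-digit prefix from start to stop."""
    for word in words:
        for number in range(start, stop + 1):
            yield f"{number:03d}{word}"


if __name__ == "__main__":
    candidates = list(prefixed_candidates(["apple"]))
    print(candidates[0], candidates[-1], len(candidates))
```

A generator is used so that a large word list does not have to be held in memory at once.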

Rainbow Table Attacks of Biographical Dictionary Generator for Password Cracking

A rainbow table is a precomputed table used for recovering passwords from hashes. Every rainbow table covers a particular password length and a defined set of characters. This technique aims to reduce the guessing time, but it is limited to passwords no longer than nine characters and to unsalted hashes (Zhang et al., 2017).

Biographical Information on Attacks of Biographical Dictionary Generator for Password Cracking

The author presented a study of existing dictionary attack prevention techniques and their drawbacks. The research explored protocols based on encrypted key exchange (EKE) for protection against offline dictionary attacks. For defeating online dictionary attacks, the paper discussed account lockout, delayed response, extra computation and Reverse Turing Tests (RTT). The password-based EKE protocol by Steven Bellovin and Michael Merritt combines cryptographic schemes to prevent offline dictionary attacks; however, it was later found that EKE and its variants are vulnerable because of plaintext equivalence, which allows a hacker to impersonate a user by using the hashed password captured through eavesdropping (Fink, 1997). Delayed responses and account locking are basic countermeasures against online dictionary attacks. They reduce the number of passwords that can be guessed in a given time and lock the user account once a threshold of failed login attempts is reached. According to Shapira (2016), these countermeasures can increase user support costs because of account locking. Moreover, attackers can make many login attempts in parallel against different user accounts to evade delayed response and account lockout. The extra-computation technique requires a non-trivial computation in addition to supplying the password (Lee, 1999). This adds a large overhead for password attack systems, because the computation is required for every login attempt, which reduces the number of attempts. However, the extra computation may introduce usability issues for legitimate users, while a hacker can absorb the overhead by using a powerful attack machine or environment.
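The two basic online countermeasures, delayed response and account lockout, can be sketched as a per-account failure counter. The class name, the threshold of five failures and the doubling delay are illustrative assumptions, not values from the cited studies.

```python
from dataclasses import dataclass, field


@dataclass
class LoginThrottle:
    max_failures: int = 5      # assumed lockout threshold
    base_delay: float = 1.0    # assumed delay in seconds, doubled per failure
    failures: dict = field(default_factory=dict)

    def delay_for(self, account: str) -> float:
        """Delay to impose before answering the next attempt."""
        count = self.failures.get(account, 0)
        return 0.0 if count == 0 else self.base_delay * 2 ** (count - 1)

    def is_locked(self, account: str) -> bool:
        return self.failures.get(account, 0) >= self.max_failures

    def record(self, account: str, success: bool) -> None:
        """Reset the counter on success, increment it on failure."""
        if success:
            self.failures.pop(account, None)
        else:
            self.failures[account] = self.failures.get(account, 0) + 1
```

Because the counter is kept per account, this sketch also shows the weakness noted above: an attacker who spreads guesses across many accounts never reaches the threshold on any single one.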

Borror (2004) used a similar approach, incorporating RTT to prevent automated programs from carrying out dictionary attacks. In this scheme, the user must supply the password and pass the RTT for a login to succeed. However, the author notes that RTT-based implementations are prone to RTT relay attacks. Several real-world RTTs (also known as CAPTCHAs, which stands for Completely Automated Public Turing test to tell Computers and Humans Apart) deployed by popular online service providers have been broken in the past using character recognition programs. Other methods for improving authentication or security against password attacks range from hardware-based solutions, biometric authentication, client certificate mechanisms, graphical password schemes and grid-based logins to multifactor authentication (Almasizadeh, 2013).

Shapira (2016) presents a range of such existing authentication mechanisms and the drawbacks associated with them, which are discussed as follows. Hardware and biometric-based authentication solutions form a robust authentication method. Their disadvantages include extra costs, the overhead of needing additional devices and the migration away from conventional password-based authentication. Hardware-based solutions also raise usability issues, because devices can be lost or forgotten. Multifactor authentication schemes usually combine passwords (something you know) with hardware (something you have) or biometric solutions (something you are), and therefore share the drawbacks of hardware and biometric solutions, along with other convenience issues. Client certificates are another solution; they implement a software-based authentication approach but have drawbacks related to the handling and storage of keys.

The analysis of password security dates back to 1979, when Morris and Thompson carried out a seminal analysis of more than 3,000 passwords (Morris et al., 1979). Password analysis can be approached in two ways, known as password cracking and semantic evaluation. Both methods focus mainly on individual passwords and the user’s information, and do not consider the relationships that can be established between passwords. To address this, a technique was developed that takes these relationships into account, since they give hackers a good foundation for cracking passwords using biographical information. The password relationship method is based entirely on biographical information about the users. It was found that 71% of the simple passwords were shorter than six characters and that 86% of them belonged to the same category, which was clustered under names. Klein (1992) reported acquiring a password file and cracking 21% of the passwords in it. It was also found that more than 50% of the passwords were shorter than six characters.

Zhang (2010) stated that the modifications a user makes to their passwords are fairly predictable, and that observation can be used to predict and crack the new password. Zhang’s (2010) research focused on the relationships among the passwords of a single user. Other methods have since evolved that focus on the relationships among the passwords of different users, based on biographical information and on clustering methods that classify and segment the users.

Juels & Rivest (2013) proposed a simple method to improve the security of hashed passwords: maintaining additional honeywords, that is, false passwords associated with every user account. The study considers an adversary who steals the file of hashed passwords and inverts the hash function. The adversary cannot tell whether the recovered word is the real password or a honeyword, and an attempt to log in with a honeyword sets off an alarm. An auxiliary server, the honeychecker, distinguishes the user’s password from the honeywords during the login routine and raises the alarm when a honeyword is submitted (Juels & Rivest, 2013).
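The division of labor between the login server and the honeychecker can be sketched as follows. This is a simplified illustration, not the construction from the paper: the class and method names are hypothetical, and plain SHA-256 stands in for the salted, slow password hash that a real system would use.

```python
import hashlib


def digest(word: str) -> str:
    # Assumption: plain SHA-256 keeps the sketch short; not suitable for real use.
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


class HoneyChecker:
    """Separate server that stores only the index of the real password."""

    def __init__(self):
        self._real_index = {}

    def set_index(self, user: str, index: int) -> None:
        self._real_index[user] = index

    def check(self, user: str, index: int) -> bool:
        return self._real_index.get(user) == index


class LoginServer:
    """Stores hashes of the real password mixed in with the honeywords."""

    def __init__(self, checker: HoneyChecker):
        self.checker = checker
        self.sweetwords = {}

    def register(self, user, password, honeywords, position):
        words = list(honeywords)
        words.insert(position, password)
        self.sweetwords[user] = [digest(w) for w in words]
        self.checker.set_index(user, position)

    def login(self, user: str, attempt: str) -> str:
        hashes = self.sweetwords.get(user, [])
        hashed = digest(attempt)
        if hashed not in hashes:
            return "rejected"
        # Only the honeychecker knows which stored entry is genuine.
        if self.checker.check(user, hashes.index(hashed)):
            return "ok"
        return "alarm"
```

An attacker who steals only the login server's file sees several equally plausible hashes per user, and any guess other than the real password triggers the alarm.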

Li, Romdhani, & Buchanan (2016) studied text-based passwords used in mobile applications and web applications. For both kinds of application, the paper investigated vulnerabilities and patterns in terms of guessing entropy, Shannon entropy and minimum entropy. The study also explains how password strength can be improved substantially based on the analysis of these entropies in text-based passwords. By analyzing datasets of text-based passwords, the authors conclude that applications can be secured with strong passwords designed around memorability, usability, security entropy and deployability (Li, Romdhani, & Buchanan, 2016).
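The three entropy measures can be computed from the empirical distribution of a password dataset. The sketch below uses the standard textbook definitions, with guessing entropy taken as the expected number of guesses when candidates are tried in order of decreasing probability; the exact formulations in the cited paper may differ.

```python
import math
from collections import Counter


def distribution(passwords):
    """Empirical probabilities of the distinct passwords, most likely first."""
    counts = Counter(passwords)
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)


def shannon_entropy(probs) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0)


def min_entropy(probs) -> float:
    """Determined entirely by the single most common password."""
    return -math.log2(max(probs))


def guessing_entropy(probs) -> float:
    """Expected number of guesses when trying the most likely password first."""
    ordered = sorted(probs, reverse=True)
    return sum(rank * p for rank, p in enumerate(ordered, start=1))
```

Minimum entropy is the most pessimistic of the three, since a dataset dominated by one popular password scores low no matter how varied the remaining passwords are.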

Sandvoll (2014) describes password management as a very important issue for many people across the world. The study designs and implements a password management system as an iOS application called PassCue. PassCue is based on the Shared Cues password management model proposed by J. Blocki, A. Datta and M. Blum in "Naturally Rehearsing Passwords". The design and implementation choices, including the evaluation of parameters, were significant in developing a system that is both secure and usable. PassCue uses cues to share secrets across several accounts in order to balance the competing goals of security and usability. PassCue provides higher security than several popular multiple-password management techniques without significantly reducing usability.

Sandvoll (2014) also discusses password management and provides probabilistic results to give a better understanding of password security. The probabilistic results concern the likelihood that an attacker compromises an account in an online attack for PassCue (9, 4, 3), PassCue (43, 4, 1) and PassCue (60, 5, 1). Furthermore, Sandvoll (2014) explains that in an offline attack with no leakage of previous plaintext passwords, cracking a password for PassCue (9, 4, 3) or PassCue (43, 4, 1) would take more than thirty-eight years and cost more than 700,000 dollars.

Moreover, cracking a password for PassCue (60, 5, 1) would take more than 1.5 million days and cost around $2.84442×10^(10) with current technology. With PassCue (9, 4, 3), the user does not need to invest extra time in keeping the passwords in memory, whereas PassCue (43, 4, 1) and PassCue (60, 5, 1) require 11 and 20 rehearsals respectively. The design and implementation of PassCue can easily be customized to support other security and usability requirements. Furthermore, the PassCue application uses little of the iPhone 5’s resources: less than 1% of the CPU and only 5.9 MB of memory in the idle state (Sandvoll, 2014).

Helkala & Snekkenes (2009) explain that people find it easy to devise passwords that are simple for them to remember but look complex to others, who would find them difficult to remember even after seeing them. Such passwords are familiar to the person who generated them but complicated for anyone else, which increases the strength of the password. However, generated passwords of this kind may have a predictable structure or a particular pattern that makes an exhaustive search feasible. Helkala & Snekkenes (2009) divide human-generated passwords into three categories according to their structure: mixture passwords, non-word passwords and word passwords.

Helkala & Snekkenes (2009) also analyzed important aspects of the password types mentioned above. They analyzed the search-space reduction within each category for many common password substructures. From this analysis, the researchers derived guidelines for creating very strong passwords in each category. The results of their study therefore contribute to the goal of achieving passwords that are both memorable and strong (Helkala & Snekkenes, 2009).

Chapter 3: Methodology of Biographical Dictionary Generator for Password Cracking

Generally, people want a password that is easy to remember and hard to crack at the same time. More often than not, these goals are mutually exclusive. People tend to follow a rule, or context-free grammar, when constructing a password. For example, denote an alphabetic string by L, a digit string by D and a string of special characters by S. One commonly used password type then takes the form S₁L₈D₃: one special character, eight letters and three digits. Weir et al. studied a dataset of passwords and derived a context-free grammar for generating the most common passwords. They took a probabilistic approach, assigning probabilities to the different types of strings after exhaustively studying the password dataset. This grammar can be used to generate a dictionary of the most commonly used passwords, and the probabilities allow the candidates to be sorted. Because an attacker does not have unlimited computing power, it makes sense to try the most probable passwords first and move to less probable ones later (Weir et al., 2009).
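The L, D and S notation can be derived mechanically from a password. The sketch below covers only the structure extraction and a frequency ranking of structures; it is not the full probabilistic grammar of Weir et al., and the function names are assumptions. Run lengths are written as plain digits rather than subscripts.

```python
from collections import Counter
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to L (letter), D (digit) or S (special)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def base_structure(password: str) -> str:
    """Collapse runs of the same class, e.g. '!password123' -> 'S1L8D3'."""
    return "".join(f"{cls}{len(list(run))}"
                   for cls, run in groupby(password, key=char_class))


def rank_structures(passwords):
    """Return (structure, relative frequency) pairs, most common first."""
    counts = Counter(base_structure(p) for p in passwords)
    total = sum(counts.values())
    return [(s, c / total) for s, c in counts.most_common()]
```

Ranking structures by frequency is what allows the most probable password shapes to be tried first.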

As people tend to use the password that they can remember, later on, most people end up using some biographical information. To help understand the correlation of personal information with user-generated password, a subset of publicly available and leaked dataset of 12036.cn was used. This dataset also has personal information of users along with the passwords they used. The personal information which is also available along with the password is listed in the below table:

Table 1: Personal information type in 12306 dataset

Information Type	Description
Name	Name of the user in Chinese
Email address	The email address of the user
Cell phone number	Cell phone number of the user
Account name	The username that the user uses to log in to the platform
ID number	ID number which the government issued to the user

The ID number also holds important personal information that can be parsed out: digits 7-14 of the ID are the birth date of the user, and digit 17 represents the gender of the user. So, in addition to the information listed in the table above, the birth date and gender of the user are also accessible. In effect, six types of personal information are available: name, birth date, cell phone number, email address, account name, and gender.

To include this personal information in the generic representation of passwords with the characters L, D, and S, new variables are introduced, namely [NAME], [BD], [ACCT], [EMAIL], and [CELL]. Gender is left out because people do not usually use their gender in their password. If “John”, who was born in 1965, has the password “john1965xyz”, it is represented as “[NAME][BD]L₃” rather than L₄D₄L₃. The plain form would lose very important information: the biographical information of the user would be characterized as just ordinary letters and digits, whereas knowing it in advance does help in cracking the password.
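The extraction of birth date and gender from the ID number can be sketched as follows. The function name `parse_id_number` is hypothetical. The sketch assumes an 18-character ID with the birth date stored as YYYYMMDD, and it assumes that an odd digit 17 means male and an even one means female; neither the length nor the odd/even rule is stated in the text above. The sample ID is made up for the example.

```python
from datetime import date


def parse_id_number(id_number: str) -> dict:
    """Extract birth date and gender from an ID number (assumed 18 characters)."""
    if len(id_number) != 18:
        raise ValueError("expected an 18-character ID number")
    birth = id_number[6:14]            # digits 7-14, counted from 1
    gender_digit = int(id_number[16])  # digit 17, counted from 1
    return {
        "birth_date": date(int(birth[:4]), int(birth[4:6]), int(birth[6:8])),
        # Assumption of this sketch: odd means male, even means female.
        "gender": "male" if gender_digit % 2 == 1 else "female",
    }


# Made-up ID number used only for illustration.
print(parse_id_number("110101199003071234"))
```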

Algorithm 1: Personal information Matching of Biographical Dictionary Generator for Password Cracking

1. procedure MATCH(pwd, infolist)

39. end procedure

The passwords are then matched against the personal information. For matching, the set of all possible substrings of a password is generated and sorted in ascending order of string length. These substrings are then compared with the personal information of the user. The matching is recursive, with the base case being a substring that does not match any personal information. Each type of personal information has its own matching technique. For example, the name is first converted to English characters and then matched in different arrangements such as first_name + last_name, last_name + first_name, first_name_initial + last_name_initial, etc. For the birth date, different forms are used, such as only the last two digits of the year, the whole four digits, etc. The other types of personal information are matched similarly. Personal information was used in around 60% of the passwords. The usage of each type of personal information is listed in the table below.
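A minimal sketch of this recursive matching is given below. It is not a reproduction of Algorithm 1, whose body is not shown here. The helper names, the tag names, and the exact sets of variants are assumptions made for the illustration, and names are assumed to be already written in English characters. The sketch walks the length-sorted substring list from the longest end so that the longest match wins, which is also an assumption.

```python
def name_variants(first: str, last: str) -> set[str]:
    """A few name arrangements; the exact set is an assumption of this sketch."""
    first, last = first.lower(), last.lower()
    return {first + last, last + first, first[0] + last[0], first, last}


def birthdate_variants(yyyymmdd: str) -> set[str]:
    """A few birth date forms; the exact set is an assumption of this sketch."""
    year, month, day = yyyymmdd[:4], yyyymmdd[4:6], yyyymmdd[6:]
    return {yyyymmdd, year, year[2:], month + day, year[2:] + month + day}


def match(password: str, info: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Split a password into (tag, text) segments by recursive substring matching."""
    if not password:
        return []
    lowered = password.lower()
    n = len(password)
    # All substrings of length >= 2, sorted in ascending order of length.
    spans = sorted(
        ((i, j) for i in range(n) for j in range(i + 2, n + 1)),
        key=lambda span: span[1] - span[0],
    )
    for i, j in reversed(spans):  # consume the list from the longest end
        for tag, variants in info.items():
            if lowered[i:j] in variants:
                left = match(password[:i], info)
                right = match(password[j:], info)
                return left + [(tag, password[i:j])] + right
    # Base case: nothing in this piece matches any personal information.
    return [("UNMATCHED", password)]


info = {
    "NAME": name_variants("John", "Smith"),
    "BD": birthdate_variants("19650412"),
}
print(match("john1965xyz", info))
```

The unmatched pieces can then be described with the ordinary L, D, and S symbols.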

Table 2: Percentage of personal information use in a password in 12306 dataset

Information Type	Usage percentage
Birthdate	24.1%
Account name	23.6%
Name	22.35%
Email	12.66%
Cell phone number	2.7%

This shows that people rely largely on their personal information when creating passwords, with the birth date being the most commonly used type. The correlation between personal information and the passwords chosen by users warrants a modification of the “Password Context-free Grammar” (PCFG). The PCFG is therefore extended to include personal information. In addition to the usual L, D, and S, further variables are introduced as semantic symbols for the personal information: B for birth date, N for name, E for email address, A for account name, and C for cell phone number. The password is matched with the personal information, and the number of characters that match each type is recorded. For example, if John has the password “helloJohn93”, it is translated to L₅N₄B₂. The next phase is calculating the probabilities and generating the dictionary of password guesses. The L words are plugged in from a dictionary of the words most commonly used in passwords. According to the probabilities, the symbolic passwords are then stored in a dictionary with the most probable password at the top and the least probable at the bottom. Because the personal information differs from user to user, the dictionary cannot be filled in with personal information at this stage. The attack scenario considered is that of an attacker who knows the target, since the personal information of arbitrary users is not easy to obtain; the approach applies only when the attacker knows the victim or when leaked passwords come together with personal information.
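The two phases described above, ranking symbolic structures by probability and filling them in later for a specific target, can be sketched as follows. The function names and the `fillers` mapping are hypothetical, subscripts are written as plain digits, and the sketch only ranks whole structures by relative frequency; it does not assign probabilities to the individual words or digits plugged into each segment.

```python
import re
from collections import Counter
from itertools import product


def structure_probabilities(structures: list[str]) -> list[tuple[str, float]]:
    """Rank symbolic structures (e.g. 'L5N4B2') from most to least probable."""
    counts = Counter(structures)
    total = sum(counts.values())
    return [(s, c / total) for s, c in counts.most_common()]


def instantiate(structure: str, fillers: dict[str, list[str]]) -> list[str]:
    """Fill a symbolic structure with candidate strings of the right length."""
    choices = []
    for symbol, length in re.findall(r"([A-Z])(\d+)", structure):
        options = [w for w in fillers.get(symbol, []) if len(w) == int(length)]
        if not options:
            return []  # this structure cannot be filled for this target
        choices.append(options)
    return ["".join(parts) for parts in product(*choices)]


# Structures observed in a (made-up) training set.
ranked = structure_probabilities(["L5N4B2", "N4B4", "N4B4", "L8D3"])
print(ranked[0])

# Per-target fill-in: L words come from a dictionary, N and B from the target.
fillers = {"L": ["hello", "love"], "N": ["John", "john"], "B": ["93", "1993"]}
print(instantiate("L5N4B2", fillers))
```

Keeping the ranked structures symbolic and instantiating them per target reflects the point that the personal information differs from user to user.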

Coverage is also an important metric. It ranges from 0 to 1 and is directly proportional to the correlation between personal information and password. A coverage of 0 means that no personal information is helpful in cracking the password, and a coverage of 1 means that one type of personal information can be used to fully crack the password. Although coverage is calculated for an individual password, the average coverage can be used as a metric for the correlation between personal information and passwords across a whole dataset. The coverage is calculated using a sliding window approach that takes the password and the personal information as input. A dynamically sized window slides from the start to the end of the password, with an initial size of 2. If the segment inside the window matches a certain type of personal information, the window grows by 1. The window size keeps increasing until a mismatch is found, which gives the longest match. At this point, the window size is reset and the process restarts. At the same time, a tag array with the same length as the password is maintained. Take the tag array [4,4,4,4,0,0,2,2] of length 8 as an example. The first 4 elements {4,4,4,4} match a certain type of personal information, while the next 2 elements {0,0} do not match anything. Similarly, the last 2 elements {2,2} match another type of personal information. Coverage is then defined as

$$\text{coverage} = \sum_{i=1}^{n} \frac{l_i^2}{L^2}$$

where n is the number of matched segments in the tag array, l_i represents the length of the corresponding segment, and L is the length of the password. The coverage of the above example is therefore (4² + 2²) / 8² = 0.3125. Coverage can be a good indicator for selecting the right algorithm for password cracking. If the coverage is close to 0, it does not make much sense to use algorithms that take personal information into account.
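The sliding window and the coverage value can be sketched together. The sketch takes coverage to be the sum of the squared lengths of the matched segments divided by the squared password length, which reproduces the 0.3125 quoted above for an 8-element tag array with a matched run of four, an unmatched run of two, and a matched run of two. The function names are hypothetical. The sketch also assumes that each non-zero tag stores the length of its segment and that a window "matches" when it is a substring of one of the personal information strings.

```python
def tag_password(password: str, info_strings: list[str], min_window: int = 2) -> list[int]:
    """Build the tag array with a growing window; 0 marks unmatched positions."""
    lowered = password.lower()
    infos = [s.lower() for s in info_strings]
    tags = [0] * len(lowered)

    def matches(segment: str) -> bool:
        return any(segment in s for s in infos)

    i = 0
    while i + min_window <= len(lowered):
        size = min_window
        if not matches(lowered[i:i + size]):
            i += 1          # slide the window by one position
            continue
        # Grow the window by 1 until a mismatch or the end of the password.
        while i + size < len(lowered) and matches(lowered[i:i + size + 1]):
            size += 1
        tags[i:i + size] = [size] * size
        i += size           # reset the window and restart after the match
    return tags


def coverage(tags: list[int]) -> float:
    """Sum of squared matched-segment lengths over the squared password length."""
    total = len(tags)
    if total == 0:
        return 0.0
    matched_squares = 0
    i = 0
    while i < total:
        if tags[i] == 0:
            i += 1
        else:
            matched_squares += tags[i] ** 2
            i += tags[i]    # assumption: the tag value is the segment length
    return matched_squares / total ** 2


tags = tag_password("john!!65", ["john", "19650412"])
print(tags, coverage(tags))
```

Squaring the segment lengths rewards one long match more than several short ones of the same total length.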

This section discusses the methods chosen for the research project. The study is conducted from a secondary research stance. The secondary research method leaves room to experiment, so that I can test whether a profile of biographical information is enough to crack a password, and the secondary research is carried out to support the findings of my project. Secondary research is better suited to this project than empirical research. Empirical research would require collecting data from various users about the biographical information on which their passwords are based. This is difficult because it is a security issue, and participants may not be happy to share this information. Conducting the project from a secondary research perspective is also more reputable, because it relies on research that has been collected from various journals.

This investigation relates to the user’s password security, determining whether biographical information is enough to crack a password. The only way to gain an understanding of this is by experimenting; in this case by setting up dummy profiles that will include biographical data for the test participants to create a password based on the biographical data of the profiles. The profiles will include data such as:

• First and last name

• Date of birth

• Mobile Number

Figure 3: The Iterative Model

There are a variety of methodologies that can be used within this research paper such as:

• Agile

• Iterative

• Waterfall

Agile Methodology

The Agile methodology is a process by which a project can be managed by breaking it up into several tasks (Sacolick, 2020). The project is broken down into six main stages:

• Project Vision Statement (A summary of the goals of the project.)

• Project Roadmap (The standards that need to be achieved for the project vision)

• Project backlog (A priority list of what needs to be done for the project)

• Release plan (A timetable for the realization of the working project)

• Sprint Backlog (The requirements, task, and goals that are linked to the current sprint)

• Increment (The final working project that could be the end product.)

Fernandez and Fernandez (2008) described Agile as a “set of values and principles”. The methodology works by having an idea of what the finished project will be and what problem it will solve. An Agile project runs through a cycle of planning, executing, and evaluating. The advantage of using an agile method is that the quality of the project is higher, because the work is broken up into several manageable tasks (Fernandez & Fernandez, 2008).

Iterative methodology of Biographical Dictionary Generator for Password Cracking

The Iterative model is a software development cycle that focuses on building a simple version of the project first and then adding complexity until the project meets its final goal. The iterative model contains five stages (Morse, 2016):

• Planning and Requirements

• Analysis and Design

• Implementation

• Testing

• Evaluation

The Iterative model is preferred over other methods, such as waterfall, because it allows enhancements to be implemented quickly, each one improving on the last iteration. NASA applied this method to software development in support of its first manned space flight (Morse, 2016).

Waterfall methodology of Biographical Dictionary Generator for Password Cracking

The waterfall methodology is where the specifications of the project are all gathered before the project begins and then a sequential project plan is created to meet the requirements. The activities in the project are broken down into small linear sequential phases. This sort of method is used for large projects and organizations, with benefits including flexibility for early design changes (Sherman, 2015). The stages involved in a waterfall methodology include:

• Requirements (A specification of what the application should do.)

• Analysis (An analysis of models and business logic that will be used in the application.)

• Design (This stage covers the technical requirements such as programming language, services, and data layers.)

• Coding (The code is written up implementing all four stages.)

• Testing (Tests are carried out to discover any bugs that may need resolving.)

• Operation (This final stage is where the application is ready for deployment.)

This project will be carried out using the Iterative model, where the focus will be on building the password cracker first and then adding more complex features to the program. The iterative model is known to work well for small projects, and using it here will help ensure that the program is built to a high standard. A basic password cracker will be built, and then more complex features will be implemented, such as the use of biographical data about the user and of Windows registry files such as the NTUSER.DAT file.

Chapter 4: Development of Biographical Dictionary Generator for Password Cracking

Development of Biographical Dictionary Generator for Password Cracking

The project demanded a high-performance computer, so I used a workstation with 1 TB of memory. The dataset used for evaluation was quite large and required this kind of workstation. The project was written in Python, which is well suited to scripting, data analysis, and research. Python has libraries that make handling large datasets much easier; one of the most popular is NumPy, which was also used in this project.

Unified Modeling language of Biographical Dictionary Generator for Password Cracking

Unified Modeling Language (UML) is used to visualize the way a system has been designed. It is referred to as a visual language because diagrams are used to demonstrate the actions and structure of a piece of software or a system. There are up to seven diagrams that can be used to portray how a piece of software or a system is structured. These seven diagrams are:

• Class Diagram (Describes the types of objects in the system and the variety of relationships that exist between them.)

• Component Diagram (These diagrams help with the visualization of the physical components in a system)

• Deployment Diagram (A diagram that specifies physical hardware on which the software system will execute.)

• Object Diagram (A snapshot of a detailed state of a system at a point in time.)

• Package Diagram (Package diagram is used to simplify complicated class diagrams.)

• Composite Structure Diagram (Composite diagram is used to show the internal structure of a classifier including its interaction points to other parts of the system)

• Profile Diagram (A profile diagram enables you to create domain- and platform-specific stereotypes and to define the relationships between them.)

These diagrams are known as structure diagrams: “structure diagrams show the static structure of the system” (Fakhroutdinov, 2019). A class diagram will be used for the biographical password cracker, as it gives a sense of orientation. It provides detailed insight into the structure of the program while still giving an overview.

Testing/Result of Biographical Dictionary Generator for Password Cracking

The results of the Password Context-free Grammar (PCFG) are compared with the results of the Modified Password Context-free Grammar (MPCFG). PCFG is one of the best-known algorithms for creating dictionaries of possible passwords. The 12306 dataset was used: half of the dataset served as training data and the other half as testing data. For the L segments, a “perfect” dictionary is used, which means that all the possible L-segment words were collected directly from the dataset. This eliminates the possibility of an unfair dictionary selection, because every word needed is already present in the dictionary and can be found efficiently. The dictionary contained more than 15,000 words.
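The evaluation setup can be sketched in a few lines. The function names are hypothetical, the random shuffle with a fixed seed is an assumption (the way the two halves were chosen is not described), and the sketch treats every maximal run of letters as an L segment without first removing the parts that match personal information.

```python
import random
import re


def split_half(records: list, seed: int = 0) -> tuple[list, list]:
    """Shuffle a copy of the records and split it into two halves."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def perfect_dictionary(passwords: list[str]) -> set[str]:
    """Collect every letter run that occurs in the passwords themselves."""
    words = set()
    for password in passwords:
        words.update(re.findall(r"[A-Za-z]+", password))
    return words


# Made-up passwords used only for illustration.
passwords = ["hello123", "abc!def1", "love2020", "x9y"]
train, test = split_half(passwords)
print(len(train), len(test))
print(sorted(perfect_dictionary(passwords)))
```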

The individual number of guesses is used as the metric to compare the effectiveness of MPCFG against PCFG. It is defined as the number of password guesses generated for each account. The bottleneck for cracking passwords lies in the number of hashing operations, including adding the salt, and is thus bounded by G·N, where G is the individual number of guesses and N is the size of the dataset being used. For different numbers of guesses, the percentage of the entire password set that has been cracked is calculated. Both methods get off to a very quick start, as both begin with high-probability guesses. With 0.5 million guesses, MPCFG achieves a cracking rate similar to the one PCFG achieves at 200 million guesses, which shows that adding personal information to the guesses improves password cracking considerably. MPCFG is also able to cover a larger password space, meaning that it is able to crack more passwords than PCFG.

Figure 4: Comparison of PCFG and MPCFG
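The metric can be illustrated with a small sketch that computes the fraction of accounts cracked within a given guess budget, together with the G·N bound on hashing operations. The function names and the sample data are hypothetical and do not come from the evaluation.

```python
def cracked_fraction(guesses: dict[str, list[str]], targets: dict[str, str], budget: int) -> float:
    """Fraction of accounts whose password appears in their first `budget` guesses."""
    if not targets:
        return 0.0
    cracked = 0
    for account, password in targets.items():
        if password in guesses.get(account, [])[:budget]:
            cracked += 1
    return cracked / len(targets)


def hash_operation_bound(individual_guesses: int, dataset_size: int) -> int:
    """Upper bound G * N on the number of hashing operations."""
    return individual_guesses * dataset_size


# Made-up accounts and ordered guess lists, used only for illustration.
targets = {"a": "john1965", "b": "qwerty"}
guesses = {
    "a": ["john65", "john1965"],
    "b": ["123456", "password", "qwerty"],
}
for budget in (1, 2, 3):
    print(budget, cracked_fraction(guesses, targets, budget))
print(hash_operation_bound(500_000, 1_000))
```

Plotting the cracked fraction against the budget for both grammars gives the kind of curve that the comparison is based on.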

Chapter 5: Discussion

Discussion of Biographical Dictionary Generator for Password Cracking

This research document has described the biographical dictionary generator for password cracking, which has a clear main goal. The main objective of the research is to build an effective, efficient, and optimized program for biographical password generation, to bring improvements to the password generation method, and to provide brief information on how dictionaries for password cracking are generated. The hacking of computers, websites, network systems, and other computing devices has increasingly become the norm as well as mainstream, and it has two sides, one negative and one positive. Both sides can easily be seen in the real world, where many attackers are always ready to attack network and computer security for their own purposes. For estimation purposes, a study conducted by Harris Poll was reviewed. Its statistics show that the majority of people reuse old passwords across multiple web and mobile applications, about one-third use different patterns and passwords for each of their accounts, and a minority use the same password for all of their accounts on websites and mobile applications.

Further to the objectives of this study, a comprehensive discussion is provided, along with different kinds of related studies on password cracking techniques. For this purpose, an extensive literature review is included. The main and most significant part of this research is to design and develop the program for cracking passwords and to test the password cracking algorithm in order to verify the technique. For the development of this program, the agile and iterative methodologies and their processes were analyzed to see whether they would be effective. The program generated for cracking passwords first matches the relevant information of the users, also matching the information against different accounts, and then cracks the password. For the details of cracking, passwords are divided into three categories. After the cracking program was created, it needed some changes and modifications. These modifications are based on observing the changes that users make to their passwords, and the passwords are cracked when they are changed. Put simply, the password cracking technique is based on the changes users make to their passwords, which are performed by modifying the passwords within the same strings. Finally, the probabilities assigned to passwords identify how likely the candidate passwords derived by this idea are to match.

Critical analysis of product of Biographical Dictionary Generator for Password Cracking

The decision to use personal information in password cracking increases the chances of a password being cracked, but it also has some limitations. It can only be used when the person who wants to crack the password is a close associate of the victim, as the personal information of random people is not easy to access, and when a dataset contains only passwords there is no way of getting the personal information of its users. The other case in which it can be used is that of leaked datasets that also contain personal information about users, but leaked datasets usually do not include this kind of personal information.

Critical analysis of my process of Biographical Dictionary Generator for Password Cracking

The process used in this project has some limitations, as only one dataset was used to evaluate the performance and compare it with other techniques. The dataset used also does not take people of other nations into account. It is possible that people in other countries do not use personal information in their passwords the way the Chinese users did. Although this is unlikely, we cannot be certain without actually evaluating datasets from those countries.

Chapter 6: Conclusion of Biographical Dictionary Generator for Password Cracking

It is concluded that the password cracking program will work effectively when users change the passwords of their accounts. The reviewed report on hacking provides significant statistics about users. The benefit of this research study will be to increase awareness of password cracking among organizations, researchers in the industry, and the wider audience. Hacking has become a norm and increasingly mainstream, which may be seen in a positive or a negative light. This study does not aim to promote hacking. Hackers can reduce the time taken to crack a password by tracing and understanding the conversation style and characteristics of the user. The success of a brute-force attack relies upon the length of the password: in this attack, the hacker attempts every combination of letters, numbers, and punctuation to produce the password. With honeywords, an auxiliary server such as the honeychecker can distinguish the user's real password from the honeywords during the login routine, and an alarm is set off when a honeyword is submitted. In the approach used here, the passwords are matched to personal information; for matching, the set of all possible substrings of a password is used, sorted in ascending order of string length. There are different techniques for matching each type of personal information; for example, the name is first converted to English characters. The dataset used for evaluation was quite large and required a high-performance workstation.

Recommendation and future work of Biographical Dictionary Generator for Password Cracking

This section describes modifications that would make the password cracking program and algorithm more optimized, more effective, and faster. The program should be optimized in the future to provide perfect biographical dictionaries based on the passwords, because password matching will become faster when the program generates perfect dictionaries of text-based passwords. It is also recommended that the program become more efficient so that it generates the dictionary file faster, and that the size limit of each password dictionary be increased, because the algorithm can currently generate only a limited number of records. Furthermore, the program needs some amendments to handle the bottlenecks of password cracking by generating larger password dictionaries. In future work, the program will work on the probability of password combinations and improve the matching of passwords with the related information. Every dataset record should be sorted and saved in the dictionary, so that guessing becomes faster and more accurate.

Bibliography of Biographical Dictionary Generator for Password Cracking

1. Bhattacharjee, S. (2013). Cockcroft-Walton generator: Circuit Analysis And Applications. i-Manager's Journal on Electronics Engineering, 3(3), 20.

2. Bhole, M. M. (2017). Honeywords for Password Security and Management.

3. Bošnjak et al , L. (2018). Brute-force and dictionary attack on hashed real-world passwords. Conference: 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO).

4. Fernandez et al , D. (2008). Agile Project Management —Agilism versus Traditional Approaches. Journal of Computer Information Systems, 10-17.

5. Fox, K. (2019). ‘True Biographies of Nations?’: The Cultural Journeys of Dictionaries of National Biography. ANU Press.

6. Helkala, K., & Snekkenes, E. (2009). Password Generation and Search Space Reduction. Journal of Computers, 663-669.

7. Juels, A., & Rivest, R. L. (2013). Honeywords: making password-cracking detectable. Proceedings of the 2013 ACM SIGSAC conference on Computer & communications, 145–160.

8. Kävrestad, J. (2017). Indexing, Searching, and Cracking. In Guide to Digital Forensics. Springer, Cham, 61-70.

9. Kävrestad, J. (2018). Cracking. In Fundamentals of Digital Forensics. Springer, Cham., 93-103.

10. Kody. (2018). Create Custom Wordlists for Password Cracking Using the Mentalist. https://null-byte.wonderhowto.com/how-to/create-custom-wordlists-for-password-cracking-using-mentalist-0183992/.

11. Lancrenon, J. K. (2013). Password-based Authenticated Key Establishment Protocols. In Computer and Information Security Handbook Morgan Kaufmann., (pp. 705-720).

12. Li, S., Romdhani, I., & Buchanan, W. (2016). Password pattern and vulnerability analysis for web and mobile applications. ZTE Communications, 14(S0), 32-36.

13. Madiraju, T. (2014). Dictionary Attacks and Password Selection. Rochester Institute of Technology RIT Scholar Works.

14. Morris et al , R. (1979). Password Security: A Case History. 22(11).

15. Morse, A. (2016, December 15). Iterative Model: What Is It And When Should You Use It. Retrieved from https://airbrake.io/blog/sdlc/iterative-model

16. Morse, A. P. (2016, November 23). Rapid Application Development (RAD): What Is It And How Do You Use It? Retrieved from https://airbrake.io/blog/sdlc/rapid-application-development

17. Notoatmodjo et al , G. (2009). Passwords and Perceptions. Proc. 7th Australasian Information Security Conference (AISC 2009),.

18. Raza et al, M. (2012). A Survey of Password Attacks and Comparative Analysis on Methods for Secure Authentication. World Applied Sciences Journal, 19(4), 439-444.

19. Rivest, R. L. (2013). Honeywords: Making password-cracking detectable. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, (pp. 145-160).

20. Sacolick, I. (2020, February 25). What is agile methodology? Modern software development explained. Retrieved from https://www.infoworld.com/article/3237508/what-is-agile-methodology-modern-software-development-explained.html

21. Saliba, J. (2018). Extracting Gold: Creating Wordlists from AXIOM Cases to Crack Passwords. https://www.magnetforensics.com/blog/extracting-gold-creating-wordlists-axiom-cases-crack-passwords/.

22. Sandvoll, M. B. (2014). Design and Analysis of a Password Management System. Fakultet for informasjonsteknologi og elektroteknikk (IE).

23. Services.Google.Com. (2019). Retrieved from https://services.google.com/fh/files/blogs/google_security_infographic.pdf

24. Sharif, M., & Khan, A. U. (2007). Benchmarking of PVM and LAM/MPI Using OSCA Rocks and Knoppix Clustering Tools in ICCISSE. International Conference on Computer,Information and Systems Science and Engineering.

25. Sherman, R. (2015). Project Management. Business Intelligence Guidebook, 449–492.

26. Sterling, C. H. (2013). Biographical Dictionary of Radio. Routledge.

27. Vigliarolo, B. (2018). Brute force and dictionary attacks: A cheat sheet. https://www.techrepublic.com/article/brute-force-and-dictionary-attacks-a-cheat-sheet/.

28. Wang, R. C. (2016). Phoney: protecting password hashes with threshold cryptology and honeywords. International Journal of Embedded Systems, 8(2-3), 146-154.

29. Weir et al , M. (2009). Password Cracking Using Probabilistic Context-Free Grammars. 30th IEEE Symposium on Security and Privacy.

30. Yazdi, S. H. (2011). Analyzing Password Strength & Efficient Password Cracking.

31. Zhang et al , L. (2017). An Improved Rainbow Table Attack for Long Passwords. Procedia Computer Science, 107(C).

32. Zheng et al , Z. (2018). An Alternative Method for Understanding User-Chosen Passwords. Security and Communication Networks, 1–12.