-= fRitziE =-: 2009

Friday, December 11, 2009

ASSIGN_02 current trends

Sample articles that has to do with DATABASES

Featured Database Articles
MS Access
December 7, 2009
Database Table and Field Naming Suggestions
By Danny Lesandrini

I chose this topic because it sounded fun but I know it's the kind of thing that could start a holy war among developers and DBAs. After all, who am I to tell anyone how they should name tables and fields? Actually, I'm not the worst choice for advice on naming schemas. It's something I've thought a lot about over the last 14 years.
Some of my naming standards were handed to me on a silver platter by kindly developers with more experience than I had and some were earned the hard way. I can say from personal experience that it's better to receive these lessons free than earn them. Consider some examples. We'll start with field naming.

Field Naming: The Dos and Don'ts
Microsoft Access is a great product that allows you to do practically anything when it comes to naming. It's my opinion that Access was originally written with Excel power users in mind. I think that somewhere in the mid 90's it was hijacked by programmers and taken to levels that Microsoft never intended. It's precisely this feature, field naming, that makes me feel this way. Hard-core developers would never dream of naming things the way Access allows for.

Below is a screen shot of a table I created in Access, which I appropriately named BAD_table. It encapsulates a number of bad practices that I see all the time when I assist clients with Access database applications that were built in-house by people who could probably be described as Excel Power Users.
They know enough to be dangerous, but they don't know why. If you're an Excel Power User, you might be feeling the hairs on the back of your neck stand on end as you read this, but stick with it for another couple paragraphs and you'll understand what I mean. The problems that come from bad naming conventions don't manifest themselves until one begins writing code but by then it's too late to change them.

So, what's wrong with these field names?

[ID]
There's technically nothing wrong with naming a field [ID] but once you start creating queries, you'll find that this is too ambiguous and you'll wish you'd named them differently. Consider this SQL statement.
SELECT * FROM tblContact c
INNER JOIN tblEmployee e ON e.ID = c.EmployeeID
INNER JOIN tblStatus s ON s.ID = c.StatusID
All three of these tables have a primary key field named [ID] but when these ID fields show up as foreign keys in other tables, their names must be changed. Thus you get links that read ID=EmployeeID and ID=StatusID instead of EmployeeID=EmployeeID and StatusID=StatusID. Some will write to tell me that this doesn't end up being as confusing as I'm making it out to be but I've been there, done that, and it's my suggestion that you avoid this approach.

[Contact Name]
There are potentially two things wrong with this field name. First, one might split this field into [ContactFirstName] and [ContactLastName], though an equally good argument might be made for keeping them together. It's just that sometimes you might want to display a name as First-Last and sometimes as Last-First. If you use only one field for the full name, you forfeit that flexibility.
The second problem, which we see a lot of, is the embedded space. Access allows you to do this, even though you will curse the day you added spaces once you start coding. It may seem like a little thing, but I've written so many thousands of lines of code that I start looking for shortcuts when typing. If you have spaces, you'll simply have to do more typing.
For example, consider this code for accessing fields in a DAO recordset:
sSQL = "SELECT FirstName, [Birth Date] FROM tblContacts"
Set rst = dbs.OpenRecordset(sSQL, dbOpenSnapshot)
Debug.Print rst!FirstName
Debug.Print rst![Birth Date]
The extra typing I'm referring to is the set of square brackets. You must add them to both the SQL statement and to the recordset field assignment. It gets more funky when you start creating forms with the wizard, which will name the associated text box Birth Date like the field. At least it appears to. Behind the scene, it uses an underscore for the space, so the control is referenced as Birth_Date. This can be seen in the event sub procedure assignments. For example, the after update event of the [Birth Date] field would be ...

Private Sub Birth_Date_AfterUpdate()
I guess the point is that it's not necessary to add spaces and contributes nothing except more typing. Access allows you to supply a caption for a field, which automatically provides for a readable alias when the field is added to a query. If you want spaces, update the Caption property but don't put them in the actual field names.

[# of Cars] and [Got Milk?]
Everything mentioned above about spaces applies equally well to special characters, although there are some special characters that Access won't allow you to use, namely . ! [ ]. These characters mean something in SQL Statements. While the pound sign and question mark are not operators and may be used, they present other problems, as does starting a field with a number.
Below is a screen shot of a table I tried to create in SQL Server. Notice that I succeeded in creating a field that begins with a number, namely [1stField]. The brackets were added automatically by SQL Server as a quiet protestation to the name. In fact, SQL Server 7 didn't even allow field names that began with a number. In my example below, [Got Milk?] is allowed with the question mark provided the field is bracketed but it choked on the pound sign.
So while Access will allow me to name a field as [# of Cars], SQL Server will not. That means that in addition to making my code more complicated, this Access table cannot be upsized to SQL Server, should that whim ever strike me.

[Last time he/she went to a Rockie's baseball game]

My spell checker now tells me that Rockie's isn't correct, which is the first thing wrong with this field name. J The next is the spaces and special characters. In fact, I'm not sure why a single quote is allowed. That's going to wreak havoc in SQL Statements, I'm sure. I don't like the foreword slash either, though I can't say exactly what harm it may or may not do.
The real problem is the length of this field name. Somehow, there has to be a balance between too much abbreviation and too many words. Do you really want to have to type a field name that big? On the other hand, simply removing the spaces and vowels is no better. Who could decipher a field named LstTmHShWntTRckysBsbllGm. In my GOOD_table below, you see I renamed this field to BaseballNote. That's a good compromise: self-explanatory enough and not too verbose.

[Name] and [Date]
For these last two, it's probably not so obvious as to why I object. Try this as an experiment. Open any Access database and press Ctl+G to bring up the immediate window. (Bottom of the VBA environment window.) Now type a question mark and one of these field names, and then hit the ENTER key. Below are the results of these two field names, and other potentially poor choices.
?Name
Microsoft Access
?Date
7/30/2009
?Now
7/30/2009 9:42:20 AM
?CurrentUser
Admin
These words represent built-in functions within Microsoft Access. The Date() function returns today's date. Now while you are allowed to create a field named [Date] it means you might end up writing SQL that looks like this ...
SELECT * FROM tblOrders WHERE Date=Date()
Does this SQL break? Not that I know of but it lends itself to ambiguity. The question is, why would you want a field named [Date]? What date is it? OrderDate or ShipDate? If you have a field named [Name], what name is it? Person name or product name or what?
Below is a screen shot of the corresponding GOOD_table. The ID field is not ambiguous, spaces are removed and no special characters were used. The field names are descriptive, yet short enough to be reasonable when typed. There's one other subtle but important difference between this screen shot and the previous table image. Can you see it?

Here's where I have to give credit to a great programmer-friend of mine, Jim Pilcher. It was nearly a decade ago when I did some work for him that he introduced me to his rule-of-thumb for sizing text fields, and I'm going to share it with you here.
Notice above in the BAD_table that the [Contact Name] field is 50 characters. What this tells you is that I accepted the Access default size and probably didn't think about what size it should really be. Now perhaps 50 characters represents 20 for First Name and 30 for Last Name, or maybe 25 for each ... or maybe I just didn't think about it at all. There's no way to know for sure.
Jim's convention when choosing field size is to use a binary value, 2 to the power of something. With this plan, field sizes will always be in this list: 1, 2, 4, 8, 16, 32, 64, 128, 255. (Access doesn't allow fields of size 256 because it needs the last bit for its own purposes.) I've added 48 and 96 to that list even though they are not powers of 2, because I like those sizes for certain fields. Have you figured out the advantage yet?
There are two benefits to be had with this sizing convention. First, as Jim pointed out, any time he sees a field in a table that does not have one of these sizes, he knows he did not personally add it. Maybe a consultant did, maybe the client, but certainly not him. It's like the parakeet in the coal mine.
Second, it lends itself to consistency. I've set the [FirstName] field size to 16 characters. I've found that to be adequate, knowing that 8 characters is going to be too small and 32 is overkill. Next time I create a [FirstName] field, I will go through the same process and I'll end up at 16 and my next [FirstName] field will be consistent with my last one. Same for [City], which is always 32 and [Email] which is always 64 characters. Consistency! This naming tends to make my field sizing more consistent.
For what it's worth, choosing powers of 2 is arbitrary. One could just as easily use a Fibonacci number series ( 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233) and accomplish the same goals.

Table Naming
Much of what has already been written about field names applies to tables. One thing I omitted in the discussion above is the use of capital letters. I personally prefer to use camel case, with the first letter of words jutting up like camel humps. Some (read Oracle developers here) choose to use all CAPITALS separated by underscores.
This, even more so than conventions already discussed, is a personal matter. However, WE ALL KNOW WHAT IT'S CALLED WHEN YOU WRITE AN EMAIL WITH ALL CAPITAL LETTERS: SHOUTING! Please don't use all capital letters when naming fields and tables ... I may have to work on your database some day and it offends me.
The next thing to consider is whether or not you will use table prefixes. Will your table be named tblEmployee or Employee? I actually don't have a strong feeling about this, but I guess I tend to prefix the table. Maybe because I don't like any ambiguity, and I sometimes use the attribute key word as an alias in my SQL statements, like this ...
SELECT FName + ' ' + LName AS Employee FROM tblEmployee
Finally, we must mention the question of plurals. Is the table tblEmployee or tblEmployees? One record of the table is one employee so the whole table is all the employees. My first inclination is to make table names plural. However, English sometimes sucks and the singular word "Category" does not become "Categorys" but "Categories". "Status" becomes "Statuses". I hate this inconsistency so much that I tend to go singular with table names.

Featured Database Articles
Sybase
December 8, 2009
Sybase ASE 15: Very Large Storage Systems
by Jeffrey Garbus

Executive Summary:
In this past year, the author has had multiple customers accelerate their migrations to ASE 15 with one goal in mind: taking advantage of Very Large Storage Structures (VLSS) in ASE 15 so that they can increase the size of their growing databases to something with an almost unlimited upside. The idea behind VLSS is that Sybase has modified the ASE system tables to accommodate the larger limitations. In this article old structure, new structure, and practical limitations are discussed.

Introduction
From its inception, Adaptive Server Enterprise (ASE) has been a high-volume Online Transaction Processing (OLTP) or Decision Support System (DSS) or mixed-use database, capable of storing, managing, and retrieving an enormous amount of data.
Over time, the definition of the word “enormous” has changed.
In the early 1990s, we designed and rolled out an application, which contained about 65 gigabytes of data. At the time, our contemporaries wondered if the system would even accommodate a table with 2.5 billon rows of data in a single table (it did). It was, at the time, the 4th largest ASE installation in the world. Anything over 20 gig was considered a Very Large Database (VLDB). Within a few months it had doubled in size, and a few months later doubled again to 250 gig, a monster in those days. The entire project was cost-justified in storage savings: a 2-gigabyte DASD device (for their DB2 applications) cost approximately $40K, while a 2 gigabyte SCSI drive cost about $2K.
Today, you can buy an external 1Terabyte hard drive for under $100. Times change.
In this past year, I have had multiple customers accelerate their migrations to ASE 15 with one goal in mind: taking advantage of Very Large Storage Structures (VLSS) in ASE 15 so that they can increase the size of their growing, 4 terabyte databases to something with an almost unlimited upside. (Yes, famous last words, but with a potential upside of 2 billion devices of 4 terabytes each, I’ll risk eating those words).
The idea behind VLSS is that Sybase has modified the ASE system tables to accommodate the larger limitations.
In this article, we’re going to discuss old structure, new structure, and practical limitations.
Pre-ASE 15 Storage
Prior to ASE 15.0, ASE was limited to 256 logical devices and 32GB for individual devices. Note that when these limits were originally put in place, 20 gig was thought to be a Very Large Database (VLDB), and it was years after before anyone attempted one. In addition, because of an intra-database addressing limitation, an individual database was limited to 4T (if you have a 2K page size) and 8T (if your page size is larger). These limitations were irrespective of physical device or OS-level limitations.
This was driven by the fact that ASE’s storage was organized by virtual pages using a 4 byte bit mask. The virtual page was the mechanism the server used to connect the logical concept of a page with the physical disk. The high order byte of the vdevno column in the sysdevices table was used to encode the device id, and consequently the byte of 8 bits limited ASE to 256 devices (8 bits as a signed integer). (Of course, with one device set aside for master, and perhaps others set aside for tempdb, sysaudits, etc., there were sometimes fewer than 256 available for data). The remaining 3 bytes were used to track the actual virtual page numbers – which considering a 2K page (the storage factor in sysdevices) provided a limit of 32GB per device. Note that theoretically, a 16K server could have large devices, but due the default of 2K and the implementation of sysdevices, a limit of 32GB was imposed.
Here’s a picture of the vdevno column:

Additionally, each database had a limit of 8TB – this was based on 256 devices of 32GB each. But, one database may only have up to 2^31 (2,147,483,648) logical pages, so its maximum size also depends on its logical page size:
2K page server 4TB
4K page server 8TB
8K page server 16TB
16K page server 32TB
Previous limits in early 10 & 11.x timeframes of the numbers of devices per database and device fragments limits were removed a long time ago (they had formerly limited a database to 32 devices. Without the device limit of 32GB, the max database size would be 32TB based on ~2 billion 16K pages.
These limitations caused consternation in a variety of shops with a real need to store database, which at least for today are still considered very big.
Very Large Storage System in ASE 15
In ASE 15, Sybase has implemented what is being called the “Very Large Storage System (VLSS).”

The vdevno/vpageno combined 4-byte integer from the “low” column has been split into separate columns. This has required the rewrite of a significant part of server code; page addresses impact the page number and page header of every physical page – not just the system tables. As a result, ASE now supports ~2 billion devices of 4TB each. It still has the 32,767 limit for the number of databases per server, with a practical limit of roughly 100 due to backup timeframes, etc.
As a result for databases, the upper limit is now driven by the integer representation for pageno and the page size or 32TB as described earlier. The theoretical server size is now 8 million terabytes (8 Exabytes) which unfortunately is limited by the number of databases to 1 Exabyte.
If you have an OLTP in the terabyte range, this is for you (but note the comments later on partitions!). However, if you have a DSS system, you should be giving strong consideration to Sybase IQ.
So:
Devices 231 (2 Billion)
Maximum device size 4TB
Databases/server 32,767
Yields Maximum Storage:
Database size 231 pages * 16KB pg = 32 TB
Theoretical DB storage 32,767 DB’s * 32TB = 1 EB (exabyte) = 1,048,544 TB
Theoretical server size 231 devices * 4TB size = 8,589,934,592 TB
Comparing pre-ASE 15 limits to newer (sic) limits, we get
Attribute Old (12.5.x) Limit New (15.0) Limit
Number of devices 256 2,147,483,648 (2^31)
Maximum device size 32 Gb
32GB ß 224 * 2K vpg 4 Tb
Maximum database size (2K / 4K / 8K / 16K) 4 Tb / ~8 Tb / ~ 8 Tb / ~8 Tb
8TB ß 256 devices * 32GB 4 Tb / 8 Tb / 16 Tb / 32 Tb
Not bad; pre-15, an entire database/server could only take up 4 Tb; with ASE 15, we can have 2 billion devices of that size.
In ASE 15.0, the virtual page number is represented by two 32-bit values. One is the device number; the other is the block number

A Word on Partitions
Storing the data is half the battle. Maybe a third.
The rest of the battle involves data manipulation and maintenance.
When you insert a row, how does the server know where the next empty page is? When you retrieve a row, how long will a table scan take? How do you backup a 10Tb database? Run dbcc? Update your statistics?
Databases in the terabyte range are nontrivial to manage at this point in history. (In another decade, they’ll fit in my wristwatch). For today, we need to plan a bit.
With the advent of ASE 15, Sybase has added real semantic partition management to Adaptive Server Enterprise. This means that you can separate your data into a variety of partitioning schemes (subject of a separate article).
Partitioning, though, is extremely relevant to large tables (which tend to go along with large databases), because of everything from insertion rates to data archival.
Partitioning nets you:
• Partitioned indexes, which yields faster access performance, smaller index structures, and improved parallel search.
• Partitioned data, which allows you to organize the data based upon your own rules (hash, data range, list partitioning, round-robin partitioning), which lets you either spread io across multiple LUNs, or to allow archiving to hit only one of several LUNs, without interfering with online access.
• Partition-aware maintenance, which lets you (for example) update statistics or run dbcc on only the more recent partitions. You can also truncate, reorg, or bcp on a partition basis.

Thursday, November 19, 2009

ASSIGN_01

Contrast between Hierarchical Database and Relational Database

The difference between the two is that hierarchical database is a database organization method that is structured in a hierarchy. All access to data starts at the top of the hierarchy and moves downward; for example, from customer to orders, vendor to purchases, etc. Contrast with relational database and network database.

A hierarchical data model is a data model in which the data is organized into a tree-like structure. The structure allows repeating information using parent/child relationships: each parent can have many children but each child only has one parent. All attributes of a specific record are listed under an entity type.

Example of a Hierarchical Model.

In a database, an entity type is the equivalent of a table; each individual record is represented as a row and an attribute as a column. Entity types are related to each other using 1: N mapping, also known as one-to-many relationships.
The most recognized and used hierarchical database is IMS developed by IBM.

On the other hand, relational database matches data by using common characteristics found within the data set. The resulting groups of data are organized and are much easier for people to understand.
For example, a data set containing all the real-estate transactions in a town can be grouped by the year the transaction occurred; or it can be grouped by the sale price of the transaction; or it can be grouped by the buyer's last name; and so on.
Such a grouping uses the relational model (a technical term for this is schema). Hence, such a database is called a "relational database."

The software used to do this grouping is called a relational database management system. The term "relational database" often refers to this type of software.
Relational databases are currently the predominant choice in storing financial records, manufacturing and logistical information, personnel data and much more.

Friday, August 14, 2009

My_Idea_Is

A. Discuss what you have learned and understood about Database Management System is, so far.

It is a special data processing system, or part of a data processing system, which aids in the storage, manipulation, reporting, management, and control of data. It is a program that lets one or more computer users create and access data in a database.

B. Define how each of the following fit and function within the framework of relational DBMS.

Data fields - includes field structure (size of each field and whether it is a date, an integer, or a text field) and field organization (names and locations of data fields within a document record).

Forms - it can be used to facilitate database data entry and/or retrieval operations. Its outcome is better compared to tables.

Foreign Key – it identifies a column or a set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table. The columns in the referencing table must be the primary key or other candidate key in the referenced table. The values in one row of the referencing columns must occur in a single row in the referenced table.

Queries - are the primary mechanism for retrieving information from a database and consist of questions presented to the database in a predefined format.

Records - A database record consists of one set of tuple for a given relational table. In a relational database, records correspond to rows in each table.

Reports - A database report presents information retrieved from a table or query in a preformatted, attractive manner.

Table Linkages – this is where primary key and foreign key are used to link each other.

Sunday, July 5, 2009

MY_ASSIGNMENT

DATA TYPES

a.)
A data type (or datatype) in programming languages is a set of values and the operations on those values.

Almost all programming languages explicitly include the notion of data type, though different languages may use different terminology. Most programming languages also allow the programmer to define additional data types, usually by combining multiple elements of other types and defining the valid operations of the new data type. For example, a programmer might create a new data type named "Person" that specifies that data interpreted as Person would include a name and a date of birth. Common data types may include:

* integers,
* floating-point numbers (decimals), and
* alphanumeric strings.

For example, in the Java programming language, the "int" type represents the set of 32-bit integers ranging in value from -2,147,483,648 to 2,147,483,647, as well as the operations that can be performed on integers, such as addition, subtraction, and multiplication. Colors, on the other hand, are represented by three bytes denoting the amounts each of red, green, and blue, and one string representing that color's name; allowable operations include addition and subtraction, but not multiplication.

A data type also represents a constraint placed upon the interpretation of data in a type system, describing representation, interpretation and structure of values or objects stored in computer memory. The type system uses data type information to check correctness of computer programs that access or manipulate the data.

b.) What role do they play in database?

The database itself can help to keep bad data out (for example you can't put 'text' into a number only field) and can speed things up especially searching and sorting?
Data type selection plays an important role in how efficiently your RDBMS functions, and so it's important to be fully aware of the options available to you, and to select the most appropriate data type for your storage needs. That's where this document comes in. It outlines the most important data types supported by PostgreSQL, describing when and how each should be used, and provides you with a ready supply of choices the next time you sit down to optimize your existing databases or create new ones.

c.) Example of data types in DBMS.

STRING
Character string, has a fixed size up to 64 kbytes.

DATE
This field is used to store the date. It takes 3 bytes and actually contains the munber of days from January 1, Year 0000.

POINTER
This field is a pointer to a record in another class. physically it contains the sub- class record number. Size can be from 1-4 bytes. Having pointer makes context somewhat different than traditional DBMS systems.Connection between tables in relational databases is done through the common indexing of two tables with common keys. In this case, the keys must be present in both tables, which means size overhead. To perform search for related records in two tables indexes are needed. In context there is no need to search. The number of a record in a sub class is contained in the field of the main class. (Terms "main class" and "sub class" in the future will only be used for clarity, but in reality all class in context are equal.) Since there are no keys, the performance is much better. In this case it is important to understand that fields are not linked together, only records. The field simple points to a record.

INTEGER, UNSIGNED
These types of fields are designed to store integer. Size may vary from 1 to 8 bytes. 8 bytes holds a very large number. It is often necessary to enter a number with decimal points (dollars and cents). A choice in selecting how precise you wish the number to be is available to you. Example: You many have an integer field that contains 7 bytes long with 2 numbers after the decimal point.

FLOAT, DOUBLE
Four and eight bit numbers with floating point. Size is fixed, but you can pick how precise you want them to be. Output will be limited to the number of digits you pick.

STRUCTURE
It is a static composite field. Actually this isn't a field. It is a package of fields or structure. This field consist of subfields that holds others ConteXt fields or objects. A subfields maybe STRUCTURE too etc. In other words this type of field can create a hierarchy of data inside a record. This opens up a lot of new opportunities in database design and makes some applications much easier to develop.

TEXT, BINARY
(BLOB - Binary Large Object) These are regular fields with a varying size. You can store unlimited size (limited by disk space) text or binary information .

IMAGE
This is a BINARY field clone. Contains GIF or JPEG image.

FILE
This field is similar to TEXT, but contains the name of the file in a file system. Essentially this is a pointer to a file. It is often a better idea to have a pointer to the file instead of copying the file contents into the file.

EXPRESSION
This is another TEXT field clone but contains a program on a certain language. When the output of a true field value is calculated.

COMPLEX
This is a dynamic composite field. The idea is similar to VARIANT but in relation to structures. Field STRUCTURE can contain sub fields, but their number and type is specified during the databse creation, and are therefore constant. What if it is necessary to change the structure of a field depending on the information? That's where COMPLEX type of fields comes in. Basically, it is a combination of a pointer to a sub Class containing the stuctures and a binary field containing the information.

Tuesday, June 30, 2009

MVvs.DataF

Characteristics:Memory Variable vs Data Field

Memory variable

The memory of a computer is organized into a regular pattern of containers for information. These containers for information are called "words". Each one has a numeric address and each one is the same size as each of the others. For most applications, it is inconvenient to refer to portions of memory by their numeric addresses, so programming languages allow us to allocate portions of memory by name. When we store information in the memory of a computer we need to decide on how much we need for various purposes and on how it will be organized. Programming languages provide mechanism for "types" of information in memory. They also provide mechanisms to identify repetitive arrays of items of the same type and to aggregate possibly heterogeneous types under a common name.Definition: A variable is a way of referring to a memory location used in a computer program. This memory location holds values- perhaps numbers or text or more complicated types of data like a payroll record.
Operating Systems load programs into different parts of RAM so there is no way of knowing exactly which memory location will hold a particular variable before the program is run. By giving a variable a symbolic name like "employee_payroll_id" the compiler or interpreter can always work out where to store the variable in memory.
VariableYou will learn more about variable and arrays, these are the smallest components of a programming language.
What is variable?

Every computer has internal memory (read as RAM)
You need to use RAM of a computer writing a Perl program
As shown in following figure memory is series of separate memory cells i.e. RAM is divided into several locations. Each location has got its own address
Each storage location holds a small amount of information
In oder to store or retrieve information from a memory location, you must give that particular location a name. This is commonly know as memory variable
So variable is a character of group of character assigned by the Perl programmer to a single memory location and used in the program as the name of that memory location in order to access the value stored in it (or assign new value to it)
For example in expression Y=10, Y is a name of memory location (variable) where 10 is stored.
Under Perl a variable can be created at anywhere (any point) in the code. Perl variable divided into three basic categories:
Scalar
Array
Hash

Data Field

In the structure of a database, the smallest component under which data is entered through data capture or data entry. All data fields in the same database have unique names, several data fields make up a data record, several data records make up a data file, and several data files make up a database.
A data field is the smallest subdivision of the stored data that can be accessed. A data field can be used to store numerical information such as price, count or a date or time, or even a data and time. A pair of data fields can be used in combination to hold a geo-spatial coordinate. Also, a data field can be used to hold a block of text. A data field takes up permanent storage within the data-store.
The data-store is composed of a number of data records which are, in turn, composed of a number of predefined data fields. Each of these data fields must be defined within the Load Definition File with a unique name.
Variables and Memory
5 distinct areas of memory
--Code space: This is where the executable instructions of the program are kept.
--registers: are part of the CPU that take care of internal housekeeping. Among other things, they contain an identifier that points to the next line of code that is to be executed, and the stack pointer.
--Global Name Space: contains objects allocated by the linker which will persist for the duration of the program.
--Stack: contains local variables, whose persistency is defined by their scope.
--Free Store, or Heap is explicitly created and destroyed by issuing new and delete commands.

Monday, June 22, 2009

_Mycontrast_

Information vs. Data.

• Information is knowledge derived from study, experience (by the senses), or instruction. It is communication intelligence. It is any kind of knowledge that is exchangeable amongst people, about things, facts concepts, etc., in some context." Information is interpreted data.

• Data are facts, statistics used for reference or analysis. Numbers, characters, symbols, images etc., which can be processed by a computer. Data must be interpreted, by human or machine, to derive meaning. It is a representation of information. The Latin word "datum" which means "that which is given. Data is any sort of raw fact.

Data Storage vs. Computer Storage

• Computer storage is the holding of data in an electromagnetic form for access by a computer processor. Primary storage is data in ramdom access memory (RAM) and other "built-in" devices. Secondary storage is data on hard disk, tapes, and other external devices.

• Data Storage device is a storage medium. Most often the term is used with computers. Data storage devices can permanently hold data, like files. Common data strorage devices are: USB flash drives, Hard disk drives, Compact discs, DVDs. A storage device may hold information,process information, or both. A device that only holds information is a recording medium.

Operating system vs. Computer System

• Operating system (commonly abbreviated to either OS or O/S) is an interface between hardware and user; it is responsible for the management and coordination of activities and the sharing of the resources of thr computer. The operating system acts as a host, one of the purposes of an operatting system is to handle the details of the operation of the hardware. This relieves application programs from having to manage these details and makes it easier to write applications. Almost all computers (including handheld computers, desktop computers, super computers, video game consoles) as well as some robots, domestic appliances (dishwashers, washing machines), and portable media players use an operating system of some type. Some of the oldest models may however use an embedded operating system that may be contained on a compact disk or other data storage device.

• Computer system includes not only the computer, but also any software and peripheral devices that are necessary to make the computer function. Evry computer system, for example, requires an operating system.

-= Memory Classifications =-

Two types of memory:

*Primary storage
*Secondary storage

Primary storage:

1. RAM(Random Access Memory)
>>this is power dependent,which means that if there is a power loss, the data that you've just saved will not be found if ever the current gets back. RAM is in a form of 'integrated circuits'.

2. ROM(Read Only Memory)
>>both data and instruction.This RAM is in a form of integrated circuits.

3. Cache memory
Acts as a buffer. It can be found inside the IC's.

4. Virtual memory
-extended memory
-dependent upon hard drive
-temporary
>>it is located inside your hard drive is set on operating system.

Secondary storage

1. Hard disc
>>it runs through a magnetic media
>>it is called 'hard' because it is "metallic"

2. Flash drives

3. CD and DVD's
>>There is also what we called Tertiary memory storage aside from primary and secondary memory storage.

Tertiary Storage
>>these are portables
•flash drive
•external CD-ROM drives
•external drives
•external hard disc- smaller hard disc
•flash memory card
•memory sticks

Sunday, March 1, 2009

♥_dEfinEd_♥

-= tEn difFeRent dEfinitiOns of datA stRuctuRe: =-

1. A means of storing a collection of data. Computer science is in part the study of methods for effectively using a computer to solve problems, or in other words, determining exactly the problem to be solved. This process entails (1) gaining an understanding of the problem; (2) translating vague descriptions, goals, and contradictory requests, and often unstated desires, into a precisely formulated conceptual solution; and (3) implementing the solution with a computer program. This solution typically consists of two parts: algorithms and data structures.

2. Way in which data are stored for efficient search and retrieval. The simplest data structure is the one-dimensional (linear) array, in which stored elements are numbered with consecutive integers and contents are accessed by these numbers. Data items stored non consecutively in memory may be linked by pointers (memory addresses stored with items to indicate where the "next" item or items in the structure are located). Many algorithms have been developed for sorting data efficiently; these apply to structures residing in main memory and also to structures that constitute information systems and databases.

3. A data structure in computer science is a way of storing data in a computer so that it can be used efficiently. It is an organization of mathematical and logical concepts of data. Often a carefully chosen data structure will allow the most efficient algorithm to be used. The choice of the data structure often begins from the choice of an abstract data type. A well-designed data structure allows a variety of critical operations to be performed, using as few resources, both execution time and memory space, as possible. Data structures are implemented by a programming language as data types and the references and operations they provide.

4. It is the interrelationship among data elements that determine how data is recorded, manipulated, stored, and presented by a database.

5. In programming, the term data structure refers to a scheme for organizing related pieces of information. The basic types of data structures include:
files
lists
arrays
records
trees
tables

Each of these basic structures has many variations and allows different operations to be performed on the data.

6. A data structure is a specialized format for organizing and storing data. General data structure types include the array, the file, the record, the table, the tree, and so on. Any data structure is designed to organize data to suit a specific purpose so that it can be accessed and worked with in appropriate ways. In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.

7. Any method of organising a collection of data to allow it to be manipulated effectively. It may include meta data to describe the properties of the structure.
Examples data structures are: array, dictionary, graph,hash, heap, linked list, matrix, object, queue,ring, stack, tree, vector.

8. An organization of information, usually in memory, for better algorithm efficiency, such as queue, stack, linked list, heap, dictionary, and tree, or conceptual unity, such as the name and address of a person. It may include redundant information, such as length of the list or number of nodes in a subtree.

9. A data structure is a way of storing information in a computer so that it can be used efficiently.
Efficiency in this context refers to the ability to find and manipulate data quickly and with the minimum consumption of computer and network resources, mainly CPU (central processing unit) time, memory space and bandwidth.
Numerous types of data structures have been developed; some are very general and widely used, while others are highly specialized for certain types of tasks. Careful selection of data structures can allow the use of the most efficient algorithms for particular tasks and thereby optimize the performance of programs. An algorithm is a precise, unambiguous sets of rules that specify how to solve some problem or perform some task.

10. Data structures can be classified in several ways, including whether they are linear or graphic and whether they are static or dynamic (i.e., whether the shape or size of the structure changes over time). Linear data structures include lists and associative arrays. List data structures can be divided into arrays, linked lists and V lists. Graph data structures include trees, adjacency lists, disjoint-sets, graph-structured stacks and scene graphs. Other data structures include frames, unions, tagged unions and tables.

-= OthEr typEs of daTa stRuctuRe =-

[[..Union (computer science)..]]

In computer science, a union is a data structure that stores one of several types of data at a single location. There are only two safe ways of accessing a union object. One is to always read the field of a union most recently assigned; tagged unions enforce this restriction. The other is to only access functionality common to all types in the union. For example, if the fields are all subtypes of a common supertype, then it is always legal to perform operations on the union object that one can perform on the supertype.
Note: The remainder of this article refers strictly to primitive untagged unions, as opposed to tagged unions.
Because of the limitations of their use, untagged unions are generally only provided in untyped languages or in an unsafe way (as in C). They have the advantage over simple tagged unions of not requiring space to store the tag.
The name "union" stems from the type's formal definition. If one sees a type as the set of all values that type can take on, a union type is simply the mathematical union of its constituting types, since it can take on any value any of its fields can. Also, because a mathematical union discards duplicates, if more than one field of the union can take on a single common value, it is impossible to tell from the value alone which field was last written.

Unions in various programming languages

C/C++

In C and C++, untagged unions are expressed nearly exactly like structures (structs), except that each data member begins at the same location in memory. The data members, as in structures, need not be primitive values, and in fact may be structures or even other unions. However, C++ does not allow for a data member to be any type that has "a non-trivial constructor, a non-trivial copy constructor, a non-trivial destructor, or a non-trivial copy assignment operator." In particular, it is impossible to have the standard C++ string as a member of a union. The union object occupies as much space as the largest member, whereas structures require space equal to at least the sum of the size of its members. This gain in space efficiency, while valuable in certain circumstances, comes at a great cost of safety: the program logic must ensure that it only reads the field most recently written along all possible execution paths.
The primary usefulness of a union is to conserve space, since it provides a way of letting many different types be stored in the same space. Unions also provide crude polymorphism. However, there is no checking of types, so it is up to the programmer to be sure that the proper fields are accessed in different contexts. The relevant field of a union variable is typically determined by the state of other variables, possibly in an enclosing struct.

One common C programming idiom uses unions to perform what C++ calls a reinterpret_cast, by assigning to one field of a union and reading from another, as is done in code which depends on the raw representation of the values. This is not, however, a safe use of unions in general.
Note that the safer tagged unions can be constructed from untagged unions (see tagged union). The safe C dialect Cyclone encourages the preference of tagged unions to untagged.

Structure and union specifiers have the same form. [ . . . ] The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.

[[..Heap (data structure)..]]

This article is about heap data structures. For “the heap” as a large pool of unused memory, see Dynamic memory allocation.

Example of a full binary max heap

In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B). This implies that an element with the greatest key is always in the root node, and so such a heap is sometimes called a max heap. (Alternatively, if the comparison is reversed, the smallest element is always in the root node, which results in a min heap.) This is why heaps are used to implement priority queues. The efficiency of heap operations is crucial in several graph algorithms.

The operations commonly performed with a heap are:

• delete-max or delete-min: removing the root node of a max- or min-heap, respectively
• increase-key or decrease-key: updating a key within a max- or min-heap, respectively
• insert: adding a new key to the heap
• merge: joining two heaps to form a valid new heap containing all the elements of both.
Heaps are used in the sorting algorithm heapsort.

Heap applications
Heaps are a favorite data structures for many applications.
• Heapsort: One of the best sorting methods being in-place and with no quadratic worst-case scenarios.
• Selection algorithms: Finding the min, max or both of them, median or even any k-th element in sublinear time[citation needed] can be done dynamically with heaps.
• Graph algorithms: By using heaps as internal traversal data structures, run time will be reduced by an order of polynomial. Examples of such problems are Prim's minimal spanning tree algorithm and Dijkstra's shortest path problem.
Interestingly, full and almost full binary heaps may be represented using an array alone. The first (or last) element will contain the root. The next two elements of the array contain its children. The next four contain the four children of the two child nodes, etc. Thus the children of the node at position n would be at positions 2n and 2n+1 in a one-based array, or 2n+1 and 2n+2 in a zero-based array. Balancing a heap is done by swapping elements which are out of order. As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.
One more advantage of heaps over trees in some applications is that construction of heaps can be done in linear time using Tarjan's algorithm.

Heap implementations
• The C++ Standard Template Library provides the make_heap, push_heap and pop_heap algorithms for binary heaps, which operate on arbitrary random access iterators. It treats the iterators as a reference to an array, and uses the array-to-heap conversion detailed above.

[[..Octree (data structure)..]]

An octree is a tree data structure in which each internal node has up to eight children. Octrees are most often used to partition a three dimensional space by recursively subdividing it into eight octants. Octrees are the three-dimensional analog of quadtrees. The name is formed from oct + tree, and normally written "octree", not "octtree".

Octrees for spatial representation

Each node in an octree subdivides the space it represents into eight octants. In a point region (PR) octree, the node stores an explicit 3-dimensional point, which is the "center" of the subdivision for that node; the point defines one of the corners for each of the eight children. In an MX octree, the subdivision point is implicitly the center of the space the node represents. The root node of a PR octree can represent infinite space; the root node of an MX octree must represent a finite bounded space so that the implicit centers are well-defined. Octrees are never considered kD-trees, as kD-trees split along a dimension and octrees split around a point. kD-trees are also always binary, which is not true of octrees.

Common uses of octrees

• Spatial indexing
• Efficient collision detection in three dimensions
• View frustum culling
• Fast Multipole Method

Application to color quantization

The octree color quantization algorithm, invented by Gervautz and Purgathofer in 1988, encodes image color data as an octree up to nine levels deep. Octrees are used because 23 = 8 and there are three color components in the RGB system. The node index to branch out from at the top level is determined by a formula that uses the most significant bits of the red, green, and blue color components, e.g. 4r + 2g + b. The next lower level uses the next bit significance, and so on. Less significant bits are sometimes ignored to reduce the tree size.

The algorithm is highly memory efficient because the tree's size can be limited. The bottom level of the octree consists of leaf nodes that accrue color data not represented in the tree; these nodes initially contain single bits. If much more than the desired number of palette colors are entered into the octree, its size can be continually reduced by seeking out a bottom-level node and averaging its bit data up into a leaf node, pruning part of the tree. Once sampling is complete, exploring all routes in the tree down to the leaf nodes, taking note of the bits along the way, will yield approximately the required number of colors.

Thursday, February 5, 2009

itS datA sTructurE(aRrayS)

-= ArrayS =-

As we all know that array is a series of elements of the same type placed in contiguous memory locations that can be individually referenced by adding an index to a unique identifier.The data 1,data 2,data 3,data 4,this is placed inside the array billy where it is performed.the illustration of this what we call billy in arrays is like what we see in the Excel,there are rows and columns.Where each blank panel represents an element of the array, that in this case are integer values of type int. These elements are numbered from 0 to 4 since in arrays the first index is always 0, independently of its length. we can access the value of any of its elements individually as if it was a normal variable, thus being able to both read and modify its value. In creating arrays,The New clause must specify the type name, followed by parentheses, followed by braces, {}. The parentheses do not represent a call to an array constructor. Instead, they indicate that the object type is an array type. The braces supply initialization values. The compiler requires the braces even if you are not supplying any values. Therefore, the New clause must include both the parentheses and the braces, even if they are empty.

-= An array of ten elements: =-

Each item in an array is called an element, and each element is accessed by its numerical index. As shown in the above illustration, numbering begins with 0. The 9th element, for example, would therefore be accessed at index 8. each element maps to a unique memory location (that is, no two entries in the array occupy the same memory locations) and (2) the mapping is consistent. That is, a given element in the array always maps to the same memory location. So what you really need is a function with two input parameters (row and column) that produces an offset into a linear array of sixteen memory locations. each element maps to a unique memory location (that is, no two entries in the array occupy the same memory locations) and (2) the mapping is consistent.

♥_ mE and oNLy mEh_♥