Prerequisite: In order to follow this tutorial and do hands-on, make sure you have installed Apache Cassandra and its up & running at least locally on one node. I have placed Apache Cassandra at
"/Users/n0r0082/opt/packages/cassandra/apache-cassandra-2.1.1".
The Apache Cassandra installation includes the cqlsh utility, a python-based command line client for executing Cassandra Query Language (CQL) commands.
Before going into details of database operations lets familiarise with Cassandra terminologies :
Start cqslh utility present in bin directory of Cassandra Installation which in turn gives cqlsh prompt.
Create a keyspace : Create a keyspace named as "OCPorc" with replication strategy attributes. On successful execution prompt does not return any status. We can see it using describe keyspaces.
List keySpaces : Below describe command lists all keyspaces present in given cluster node.
Selecting a keyspace : Since column families (like table in RDMS) are created in context of keyspace, we need to select keyspace we want to work with.
Create Table(Column Family) in selected Keyspace: Create table name USERS with username as primary key and other associated properties like boolean_filter, grace_seconds, etc. In Cassandra terminology Table are termed as Column Family. Instead of using "Create TABLE USERS ....." keyword, we can use "CREATE COLUMNFAMILY USERS ..... "
Insert data in USERS Table(Column Family) : Below we insert 3 rows in USERS table. On successful execution of INSERT statement we do not see any response in the shell whatsoever; it should simply provide us with a new command prompt.
Note: We can insert columns values partially. i.e: Some fields can be empty, like below command insert one row but phone field is empty.
Select rows from USERS table(Column Family): SELECT command is same as used in RDBMS SQL. it returns all 4 records, notice phone is null in row with username "Rahul"
Other variations of SELECT Query:
Below query shows all above use case. When we execute IN query with non-primary key it throw error as "not supported".
Delete columns data : Execute Delete Query and we can notice difference in terms of email value in first row.
Note: In Delete command we cannot use non-primary key. Below query executions fails because where clause using "City", non-primary key
Update table row:
Note: Update command also expect primary key in WHERE clause ,below command execution failed because of non-primary key column "city"
Note: Using Secondary index we can avoid primary key requirement with WHERE clause.
Create Index on USERS : Below command create an index based on column name city
It is recommended by Datastax to avoid Secondary Index. Secondary indexes are tricky to use and can impact performance greatly. The index table is stored on each node in a cluster, so a query involving a secondary index can rapidly become a performance nightmare if multiple nodes are accessed.
==== ***** =====
"/Users/n0r0082/opt/packages/cassandra/apache-cassandra-2.1.1".
The Apache Cassandra installation includes the cqlsh utility, a python-based command line client for executing Cassandra Query Language (CQL) commands.
Before going into details of database operations lets familiarise with Cassandra terminologies :
- Cluster - It is collection of nodes(data centers) arranged in a ring architecture.
- Keyspace - The keyspace is the outermost container for data in Cassandra. The main attributes to keyspace are the Replication Factor, the Replica Placement Strategy and the Column Families. Relational database schema is analogous to keyspace in Cassandra.
- Column Family - Column Families in Cassandra are like tables in Relational Databases. Each Column Family contains a collection of rows. The key gives the ability to access related data together,
- Column- A column in Cassandra is a data structure which contains a column name, a value and a timestamp.
Start cqslh utility present in bin directory of Cassandra Installation which in turn gives cqlsh prompt.
➜ ~ cd /Users/n0r0082/opt/packages/cassandra/apache-cassandra-2.1.1/bin ➜ bin ./cqlsh Connected to Test Cluster at 127.0.0.1:9042. [cqlsh 5.0.1 | Cassandra 2.1.1 | CQL spec 3.2.0 | Native protocol v3] Use HELP for help. cqlsh>
Create a keyspace : Create a keyspace named as "OCPorc" with replication strategy attributes. On successful execution prompt does not return any status. We can see it using describe keyspaces.
cqlsh:my_status> CREATE KEYSPACE "OCProc" ... WITH REPLICATION = { ... 'class': 'SimpleStrategy', 'replication_factor': 1 ... };
List keySpaces : Below describe command lists all keyspaces present in given cluster node.
cqlsh:my_status> describe keyspaces; system_traces my_status "OCProc" system
Selecting a keyspace : Since column families (like table in RDMS) are created in context of keyspace, we need to select keyspace we want to work with.
cqlsh:my_status> use "OCProc" ; cqlsh:OCProc>
Create Table(Column Family) in selected Keyspace: Create table name USERS with username as primary key and other associated properties like boolean_filter, grace_seconds, etc. In Cassandra terminology Table are termed as Column Family. Instead of using "Create TABLE USERS ....." keyword, we can use "CREATE COLUMNFAMILY USERS ..... "
cqlsh:OCProc> CREATE TABLE "USERS" ( ... "username" text PRIMARY KEY, ... "email" text, ... "city" text, ... "phone" varint, ... "encrypted_password" blob ... ) WITH bloom_filter_fp_chance = 0.01 ... AND comment = 'Creating USERS Tabel to store users details' ... AND dclocal_read_repair_chance = 0.1 ... AND default_time_to_live = 0 ... AND gc_grace_seconds = 864000 ... AND max_index_interval = 2048 ... AND memtable_flush_period_in_ms = 0 ... AND min_index_interval = 128 ... AND read_repair_chance = 0.0 ... AND compaction = {'min_threshold': '4', 'class': ... 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', ... 'max_threshold': '32'} ... AND compression = {'sstable_compression': ... 'org.apache.cassandra.io.compress.LZ4Compressor'} ... AND speculative_retry = '99.0PERCENTILE'; cqlsh:OCProc> describe tables; "USERS"
Insert data in USERS Table(Column Family) : Below we insert 3 rows in USERS table. On successful execution of INSERT statement we do not see any response in the shell whatsoever; it should simply provide us with a new command prompt.
cqlsh:OCProc> INSERT INTO "USERS" ... ("username", "email", "city", "phone", "encrypted_password") ... VALUES ( ... 'zytham', ... 'zytham@gmail.com', ... 'Patna', ... 9999888800, ... 0x9792977ed729792e403da53024c6069a9158b8c4 ... ); cqlsh:OCProc> INSERT INTO "USERS" ... ("username", "email", "city", "phone", "encrypted_password") ... VALUES( ... 'ranjan', ... 'ranjan@gmail.com', ... 'Bangalore', ... 678998800, ... 0x8914977ed729792e403da53024c6069a9158b8c4 ... ); cqlsh:OCProc> cqlsh:OCProc> INSERT INTO "USERS" ... ("username", "email", "city", "phone", "encrypted_password") ... VALUES( ... 'mishra', ... 'zytham@gmail.com', ... 'Torento', ... 00980099766, ... 0x891497745729792e403da53024c6069a9158b8c4 ... );
cqlsh:OCProc> INSERT INTO "USERS" ... ("username", "email", "city", "encrypted_password") ... VALUES ( ... 'Rahul', ... 'rahul@gmail.com', ... 'Mumbai', ... 0x9792977ed729792e403da53024c6069a9158b8c4 ... );
Select rows from USERS table(Column Family): SELECT command is same as used in RDBMS SQL. it returns all 4 records, notice phone is null in row with username "Rahul"
cqlsh:OCProc> SELECT * FROM "USERS" ... ; username | city | email | encrypted_password | phone ----------+-----------+------------------+--------------------------------------------+------------ zytham | Patna | zytham@gmail.com | 0x9792977ed729792e403da53024c6069a9158b8c4 | 9999888800 ranjan | Bangalore | ranjan@gmail.com | 0x8914977ed729792e403da53024c6069a9158b8c4 | 678998800 Rahul | Mumbai | rahul@gmail.com | 0x9792977ed729792e403da53024c6069a9158b8c4 | null mishra | Torento | zytham@gmail.com | 0x891497745729792e403da53024c6069a9158b8c4 | 980099766 (4 rows)
Other variations of SELECT Query:
- List specific columns in result-set
- List all rows whose primary key provided
- Pagination query using LIMIT
Below query shows all above use case. When we execute IN query with non-primary key it throw error as "not supported".
cqlsh:OCProc> SELECT "username", "email" FROM "USERS" WHERE "city" = 'Patna' ; username | email ----------+------------------ zytham | zytham@gmail.com (1 rows) cqlsh:OCProc> SELECT * FROM "USERS" WHERE "username" IN ('zytham', 'mishra'); username | city | email | encrypted_password | phone ----------+---------+------------------+--------------------------------------------+------------ zytham | Patna | zytham@gmail.com | 0x9792977ed729792e403da53024c6069a9158b8c4 | 9999888800 mishra | Torento | zytham@gmail.com | 0x891497745729792e403da53024c6069a9158b8c4 | 980099766 (2 rows) cqlsh:OCProc> SELECT * FROM "USERS" WHERE "city" IN ('Patna', 'Torento'); code=2200 [Invalid query] message="IN predicates on non-primary-key columns (city) is not yet supported" cqlsh:OCProc> SELECT * FROM "USERS" LIMIT 2; username | city | email | encrypted_password | phone ----------+-----------+------------------+--------------------------------------------+------------ zytham | Patna | zytham@gmail.com | 0x9792977ed729792e403da53024c6069a9158b8c4 | 9999888800 ranjan | Bangalore | ranjan@gmail.com | 0x8914977ed729792e403da53024c6069a9158b8c4 | 678998800 (2 rows)
Delete columns data : Execute Delete Query and we can notice difference in terms of email value in first row.
cqlsh:OCProc> SELECT * FROM "USERS"; username | city | email | encrypted_password | phone ----------+-----------+------------------+--------------------------------------------+------------ zytham | Patna | zytham@gmail.com | 0x9792977ed729792e403da53024c6069a9158b8c4 | 9999888800 ranjan | Bangalore | ranjan@gmail.com | 0x8914977ed729792e403da53024c6069a9158b8c4 | 678998800 Rahul | Mumbai | rahul@gmail.com | 0x9792977ed729792e403da53024c6069a9158b8c4 | null mishra | Torento | zytham@gmail.com | 0x891497745729792e403da53024c6069a9158b8c4 | 980099766 cqlsh:OCProc> DELETE "email" FROM "USERS" WHERE "username"= 'zytham'; cqlsh:OCProc> SELECT * FROM "USERS"; username | city | email | encrypted_password | phone ----------+-----------+------------------+--------------------------------------------+------------ zytham | Patna | null | 0x9792977ed729792e403da53024c6069a9158b8c4 | 9999888800 ranjan | Bangalore | ranjan@gmail.com | 0x8914977ed729792e403da53024c6069a9158b8c4 | 678998800 Rahul | Mumbai | rahul@gmail.com | 0x9792977ed729792e403da53024c6069a9158b8c4 | null mishra | Torento | zytham@gmail.com | 0x891497745729792e403da53024c6069a9158b8c4 | 980099766 (4 rows)
Note: In Delete command we cannot use non-primary key. Below query executions fails because where clause using "City", non-primary key
cqlsh:OCProc> DELETE "email" FROM "USERS" WHERE "city"= 'Patna'; code=2200 [Invalid query] message="Non PRIMARY KEY city found in where clause"
Update table row:
cqlsh:OCProc> UPDATE "USERS" SET "city"='Delhi' WHERE "username" = 'zytham' ; cqlsh:OCProc> SELECT * FROM "USERS"; username | city | email | encrypted_password | phone ----------+-----------+------------------+--------------------------------------------+------------ zytham | Delhi | null | 0x9792977ed729792e403da53024c6069a9158b8c4 | 9999888800 ranjan | Bangalore | ranjan@gmail.com | 0x8914977ed729792e403da53024c6069a9158b8c4 | 678998800 Rahul | Mumbai | rahul@gmail.com | 0x9792977ed729792e403da53024c6069a9158b8c4 | null mishra | Torento | zytham@gmail.com | 0x891497745729792e403da53024c6069a9158b8c4 | 980099766 (4 rows)
Note: Update command also expect primary key in WHERE clause ,below command execution failed because of non-primary key column "city"
cqlsh:OCProc> UPDATE "USERS" SET "city"='Delhi' WHERE "city" = 'Patna' ; code=2200 [Invalid query] message="Non PRIMARY KEY city found in where clause"
Note: Using Secondary index we can avoid primary key requirement with WHERE clause.
Create Index on USERS : Below command create an index based on column name city
cqlsh:OCProc> CREATE INDEX emp_city_idx ON "USERS" (city);
It is recommended by Datastax to avoid Secondary Index. Secondary indexes are tricky to use and can impact performance greatly. The index table is stored on each node in a cluster, so a query involving a secondary index can rapidly become a performance nightmare if multiple nodes are accessed.
==== ***** =====
Aivivu đại lý vé máy bay, tham khảo
ReplyDeleteve may bay di my gia re
chuyến bay cứu trợ mỹ về việt nam
vé từ nhật về việt nam
vé máy bay từ đức về việt nam giá rẻ
vé máy bay từ canada về việt nam
Lịch bay từ Hàn Quốc về Việt Nam tháng 7
khách sạn cách ly ở đà nẵng