Avoid row-by-row scanning


Link to this posting

Postby Ursego » 19 Feb 2013, 22:28

Don't process DataWindows and DataStores row by row if the task can be accomplished in another, more efficient way.

Find and process rows which satisfy a criteria

Use Find() in a loop rather than compare values in each row:

*** BAD code: ***

Code: Select all
ll_row_count = dw_emp.RowCount()
for ll_row = 1 to ll_row_count
       ll_curr_dept_id = dw_emp.object.dept_id[ll_row]
       if ll_curr_dept_id <> al_dept_id then continue
       // process the found row
next

*** GOOD code: ***

Code: Select all
ll_row = 0
ll_row_count = dw_emp.RowCount()
ls_search_expr = "dept_id=" + String(al_dept_id)
do while true
       ll_row = dw_emp.Find(ls_search_expr, ll_row + 1, ll_row_count)
       if ll_row = 0 then exit
       // process the found row
       if ll_row = ll_row_count then exit // prevent eternal loop when the last row satisfies the search condition
loop

In many situations (some of which are described below), the function Describe("Evaluate('<EXPRESSION>, 1, 0) for all', 1)") is very useful if you want to avoid scanning the DW in a loop with a comparison in each row.

Count rows which satisfy a criteria

Code: Select all
ll_active_emp_count = Long(dw_emp.Describe("Evaluate('Sum(if(status = ~"A~", 1, 0) for all)', 1)"))
MessageBox("HR", "There are " + String(ll_active_emp_count) + " active employees.")

Maximum value of the field in all rows

Code: Select all
ld_latest_order_date = Date(ids_order.Describe("Evaluate('Max(order_date)', 0)"))
MessageBox("Orders", "The last order was made in " + String(ld_latest_order_date, '"MMM DD, YYYY") + ".")

Count highlighted rows

Code: Select all
ll_selected_count = Long(dw_emp.Describe("Evaluate('Sum(if(IsSelected(), 1, 0) for all)', 1)"))
if ll_selected_count < 2 then MessageBox("New Team", "Please select at least 2 employees.")

Define if the field has a duplicated value (i.e. is not unique)

Code: Select all
dw_order_status.SetSort('order_status A')
dw_order_status.Sort()
dw_order_status.GroupCalc()
lb_duplicate_exists = ("1" = dw_order_status.Describe("Evaluate('Max(if(GetRow() <> 1 AND order_status[-1] = order_status, 1, 0))', 0)"))
if lb_duplicate_exists then MessageBox("Error", "Order Statuses must be unique.")

Put the same value in the field in all the rows

If you need to assign a same value to a field in ALL the rows (like a coefficient another field should be multiplied or divided by), use a computed field instead of assigning the value to the field in each row in a loop. For that, make that field computed, with a very simple expression "1", and change that expression programmatically by simply making the value the field's expression.

*** BAD code: ***

Suppose, a variable ll_coef_to_divide contains the result of a calculation in PB code. Here is the inefficient solution (assuming that the field is not computed but exists in the DW's data source):

Code: Select all
for ll_row = 1 to ll_row_count
       dw_XXX.object.coef_to_divide[ll_row] = ll_coef_to_divide
next


*** GOOD code: ***

To make the assignment at one stroke, the field should be computed. The value, which the field will return, is assigned this way:

Code: Select all
dw_XXX.object.c_coef_to_divide.Expression = String(ll_coef_to_divide)
dw_XXX.GroupCalc() // recalc other computed fields which mention c_coef_to_divide in their expressions
User avatar
Ursego
Site Admin
 
Posts: 143
Joined: 19 Feb 2013, 20:33



Ketones are a more high-octane fuel for your brain than glucose. Become a biohacker and upgrade yourself to version 2.0!



cron
Traffic Counter

eXTReMe Tracker