Thursday, November 26, 2020

Using GDB in simulation runs to debug C-code

When we run simulations with C-models (along with the RTL and an SV-UVM testbench), we might want to debug the C code to identify issues.

Here is a list of GDB commands (for VCS):

  1. gdb <SIMV absolute path>
  2. break <File_name>:<Line_number>
  3. run <Simulation command line arguments>

These will let you run the simulation until it hits the breakpoint.

  1. To execute line by line, stepping over function calls, use 'next'
  2. To step into a function call, use 'step'
  3. To continue until the next breakpoint is hit, use 'continue'
  4. To print any variable, use 'print <var>'



Tuesday, November 17, 2020

Stuck at 0ns

Yesterday, I had an issue in one of my simulations, where the run was stuck at 0ns. Upon debug, I found an issue in the code. Although it looks obvious in hindsight, it takes quite a while to figure these out. Isolating the problem:

module top;
  bit [2:0] cnt;
  initial begin
   for(cnt = 0; cnt <8;cnt++) $display("Cnt:%0d",cnt);
 end
endmodule

Look through the code, and we might feel it just prints 0...7 and the simulation stops. Once we run the code, we realize that it gets stuck in an infinite loop. The reason is simple: 'cnt' is a 3-bit variable, so it can only hold 0 to 7 and the condition cnt < 8 is always true. When the loop reaches 7, 'cnt' is incremented and wraps around to 0, starting the loop once again, and this goes on and on without simulation time ever advancing.
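One possible fix, sketched here, is to count with a loop variable that is wide enough to hold the terminating value 8, and to copy it into the 3-bit signal only where that is needed. The module name top_fixed is just illustrative.

```systemverilog
module top_fixed;
  bit [2:0] cnt;
  initial begin
    // 'i' is a 32-bit int, so it can reach 8 and terminate the loop.
    for (int i = 0; i < 8; i++) begin
      cnt = i[2:0];               // keep the 3-bit value if the design needs it
      $display("Cnt:%0d", cnt);
    end
  end
endmodule
```

Declaring 'cnt' itself one bit wider (bit [3:0]) would work as well; which one fits better depends on how 'cnt' is used elsewhere.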


 

Saturday, November 7, 2020

System verilog constraints and OOPs

Do the constraints of a parent class apply when a child class handle is assigned to a parent class handle?

Code : 

class parent;
  rand int a;
  rand int b;
  constraint c_a { a == 10; }
  constraint c_b { b == 100; }
  
  function void display();
    $display("Parent - A - %0d B - %0d ",a,b);
  endfunction
endclass: parent

class child extends parent;
  rand int a;
  constraint c_a { a == 20; }
  
  virtual function void display();
    $display("Child - A - %0d B - %0d ",a,b);
  endfunction
endclass

class grand_child extends child;
  
  constraint c_a { a == 30; }
  
  function void display();
    $display("Grand Child - A - %0d B- %0d ",a,b);
  endfunction
endclass


module top;
  parent      p;
  child       c;
  grand_child gc;
  
  initial begin
    p = new;
    c = new;
    gc= new;
    
    p = gc;    
    
    void'(p.randomize());
    void'(c.randomize());
    void'(gc.randomize());
    p.display();
    c.display();
    gc.display();
  end
    
  
endmodule

 

Here we have 2 cases:

1. Case 1: The grandchild handle 'gc' is assigned to the parent handle 'p'. The display function in parent is not declared virtual, so p.display() runs the parent's version and prints the parent's own 'a' and 'b'.

2. Case 2: The display function in the parent class is declared virtual.

Observations:

  • In the print logs, the parent class constraint on integer 'a' is not applied: we see a random integer value displayed instead of 10.
  • Integer 'b' has its constraint applied, and therefore you see 100 in the print statement.
  • Reason: the parent handle now points to the grandchild object, and randomize() solves the constraints of that object. The child class re-declares 'rand int a', and its constraint c_a has the same name as the parent's, so it replaces the parent's c_a and acts on the child's 'a'. The parent's own 'a' is left unconstrained, while 'b' and c_b are inherited unchanged.
  • What if we use the virtual keyword before display in the parent? Now p.display() resolves to the grandchild's display function, and therefore you see 30 - 100.
  • What if you remove 'int a' in the child class? Then there is only one 'a', the overriding constraint applies to it, and the parent display prints 30.

Without virtual keyword ( in parent display function ):

Compiler version Q-2020.03-SP1-1; Runtime version Q-2020.03-SP1-1; Nov 7 23:53 2020
Parent - A - -1360295855 B - 100
Child - A - 20 B - 100
Grand Child - A - 30 B- 100
V C S S i m u l a t i o n R e p o r t  

With virtual keyword ( in parent display function )

Compiler version Q-2020.03-SP1-1; Runtime version Q-2020.03-SP1-1; Nov 7 23:55 2020
Grand Child - A - 30 B- 100
Child - A - 20 B - 100
Grand Child - A - 30 B- 100
V C S S i m u l a t i o n R e p o r t  

Removing 'int a' in child class ( keep the constraint )

Compiler version Q-2020.03-SP1-1; Runtime version Q-2020.03-SP1-1; Nov 8 00:15 2020
Parent - A - 30 B - 100
Child - A - 20 B - 100
Grand Child - A - 30 B- 100
V C S S i m u l a t i o n R e p o r t
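To see the two copies of 'a' side by side, a reduced sketch like the one below can help. The class and module names (base_c, derived_c, shadow_demo) are made up for this illustration; super.a is used to reach the parent's hidden variable from the derived class.

```systemverilog
class base_c;
  rand int a;
  constraint c_a { a == 10; }
endclass

class derived_c extends base_c;
  rand int a;                    // hides base_c::a
  constraint c_a { a == 20; }    // same name, so it replaces base_c::c_a

  function void show();
    // 'a' is the derived copy, 'super.a' is the hidden base copy
    $display("derived a:%0d base a:%0d", a, super.a);
  endfunction
endclass

module shadow_demo;
  derived_c d;
  initial begin
    d = new;
    if (!d.randomize()) $error("Randomization failed");
    d.show();
  end
endmodule
```

The derived 'a' should come out as 20, while the base 'a' has no active constraint left and is expected to take an arbitrary value, which matches the first log above.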

Thursday, October 29, 2020

CTAG setup for system verilog and c++

CTAGS are an easy way to navigate through a code hierarchy. By default, SystemVerilog is not supported in CTAGS. To add the language support, we need to write a .ctags file with regular expressions that cover the SV constructs we want to tag.

Procedure:

  1. Make sure you place the .ctags file in the $HOME directory.
  2. Inside the .vimrc, add the following line >> set tags=tags;
  3. Go to the project directory and run the following command in the shell >> ctags -R * (this generates a tags file in the directory).
  4. Use ctrl+] to jump to the definition under the cursor, and ctrl+o to get back to the initial file/line. >> man ctags for more info.
  5. That's it!!! We are good to go.

 
.ctags file information : 

--exclude=.SOS
--exclude=.git
--exclude=nobackup
--exclude=nobkp

--langdef=systemverilog
--langmap=systemverilog:.v.vg.sv.svh.tv.vinc

--regex-systemverilog=/^\s*(\b(static|local|virtual|protected)\b)*\s*\bclass\b\s*(\b\w+\b)/\3/c,class/
--regex-systemverilog=/^\s*(\b(static|local|virtual|protected)\b)*\s*\btask\b\s*(\b(static|automatic)\b)?\s*(\w+::)?\s*(\b\w+\b)/\6/t,task/
--regex-systemverilog=/^\s*(\b(static|local|virtual|protected)\b)*\s*\bfunction\b\s*(\b(\w+)\b)?\s*(\w+::)?\s*(\b\w+\b)/\6/f,function/

--regex-systemverilog=/^\s*\bmodule\b\s*(\b\w+\b)/\1/m,module/
--regex-systemverilog=/^\s*\bprogram\b\s*(\b\w+\b)/\1/p,program/
--regex-systemverilog=/^\s*\binterface\b\s*(\b\w+\b)/\1/i,interface/
--regex-systemverilog=/^\s*\btypedef\b\s+.*\s+(\b\w+\b)\s*;/\1/e,typedef/
--regex-systemverilog=/^\s*`define\b\s*(\w+)/`\1/d,define/
--regex-systemverilog=/}\s*(\b\w+\b)\s*;/\1/e,typedef/

--regex-systemverilog=/^\s*(\b(static|local|private|rand)\b)*\s*(\b(shortint|int|longint)\b)\s*(\bunsigned\b)?(\s*\[.+\])*\s*(\b\w+\b)/\7/v,variable/
--regex-systemverilog=/^\s*(\b(static|local|private|rand)\b)*\s*(\b(byte|bit|logic|reg|integer|time)\b)(\s*\[.+\])*\s*(\b\w+\b)/\6/v,variable/
--regex-systemverilog=/^\s*(\b(static|local|private)\b)*\s*(\b(real|shortreal|chandle|string|event)\b)(\s*\[.+\])*\s*(\b\w+\b)/\6/v,variable/
--regex-systemverilog=/(\b(input|output|inout)\b)?\s*(\[.+\])*\s*(\b(wire|reg|logic)\b)\s*(\[.+\])*\s*(#(\(.+\)|\S+)\))?\s*(\b\w+\b)/\9/v,variable/
--regex-systemverilog=/(\b(parameter|localparam)\b).+(\b\w+\b)\s*=/\3/a,parameter/

--systemverilog-kinds=+ctfmpied

--languages=systemverilog,C,C++,HTML,Lisp,Make,Matlab,Perl,Python,Sh,Tex

Sunday, August 23, 2020

Driving chunks of data using verilog classes

 Problem Statement :

  • Consider a DUT which asks for say n chunks of data (n <=200 bytes). 
  • The driver needs to drive these chunks on the interface. 
  • Write a transaction class which randomly generates these chunks of data. 
  • Each chunk of data has to be a collection of beats where the size of all beats except the last one is a multiple of 4 and maximum beat size is 64.

Code :

 

class chunk_data #(parameter N =200);

  rand bit [7:0] data[$];
  rand bit [7:0] chunk_size[$];

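  constraint data_size  { data.size == N; }
  // sum() returns the element type (8 bits here), so widen each item to int
  // before adding; otherwise a larger total can wrap and still pass the check.
  constraint chuuk_size { chunk_size.size inside {[3:25]}; 
                          chunk_size.sum() with (int'(item)) == N-(N%4);
                          foreach(chunk_size[i]) { 
                            chunk_size[i]%4 == 0; 
                            chunk_size[i] inside {[4:64]}; }
                        }

  function void post_randomize();
    int diff_chunk;
    // Whatever is left after the multiple-of-4 chunks becomes the last chunk.
    diff_chunk = N - chunk_size.sum() with (int'(item));
    if(diff_chunk != 0) chunk_size.push_back(diff_chunk);
  endfunction: post_randomize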

  function void display();
    $display("==============================");
    foreach(chunk_size[i])
    $display("Chunk_Size:%0d",chunk_size[i]);
  endfunction: display
endclass: chunk_data


module top;
  parameter N = 199;
  chunk_data#(N) c;

  initial begin
    c = new;
    repeat(2) begin
      void'(c.randomize());
      c.display();
    end
  end
endmodule: top

 

Results :

Compiler version L-2016.06; Runtime version L-2016.06;  Aug 23 22:38 2020
==============================
Chunk_Size:4
Chunk_Size:4
Chunk_Size:24
Chunk_Size:4
Chunk_Size:20
Chunk_Size:4
Chunk_Size:4
Chunk_Size:36
Chunk_Size:20
Chunk_Size:24
Chunk_Size:4
Chunk_Size:4
Chunk_Size:36
Chunk_Size:8
Chunk_Size:3
==============================
Chunk_Size:56
Chunk_Size:4
Chunk_Size:20
Chunk_Size:8
Chunk_Size:20
Chunk_Size:8
Chunk_Size:4
Chunk_Size:52
Chunk_Size:24
Chunk_Size:3
           V C S   S i m u l a t i o n   R e p o r t
Time: 0
CPU Time:      1.780 seconds;       Data structure size:   0.0Mb
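A general caveat when summing 8-bit elements: the array reduction method sum() returns a value of the element type, so the total is computed modulo 256 unless each item is widened first. The sketch below uses made-up values (the module name sum_width_demo is hypothetical) whose true total is 452.

```systemverilog
module sum_width_demo;
  // Seven chunks of 64 plus one chunk of 4: the true total is 452.
  bit [7:0] q[$] = {64, 64, 64, 64, 64, 64, 64, 4};

  initial begin
    $display("8-bit sum:%0d", q.sum());                    // 452 wraps to 196
    $display("int sum  :%0d", q.sum() with (int'(item)));  // widened first: 452
  end
endmodule
```

A wrapped total of 196 is exactly the value the constraint looks for when N is 199, so a check on the 8-bit sum alone could accept a set of chunks that is far larger than intended.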
 

 

Saturday, August 22, 2020

Using System Verilog Constraints judiciously

SystemVerilog constraints can take up a lot of simulation time if not used properly.

Let me try to illustrate it with an example.

Take a scenario where we need to generate a unique address (for burst traffic) each time we randomize.

The configuration is as follows:

1. Page size - 4096 bytes (4K)

2. Address range - 63:0

3. The align signal selects between a burst that starts at an offset inside the page and runs to the end of the page, and a burst that covers the whole page (4096 bytes in our case).

align == 1 -> no page offset

align == 0 -> some page offset

Since the address is 64 bits and the page offset takes 12 bits, I used a variable page_num (64 - 12 = 52 bits wide) to represent the pages that can be addressed with the address range (63:0).

Let's take 2 approaches to pick the random address and compare them:

1. With constraints

2. Without using constraints


CASE 1 :

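In this case, we simply use a queue and push every used page number into it in post_randomize().

In the constraint, we use the unique keyword to make sure that we do not repeat a page number. This means the solver has to take the whole, ever-growing queue into account on every call to randomize().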


class address_selector;
  rand bit [51:0] page_num;
  rand bit [11:0] page_offset;
  rand bit align;
  bit [51:0] used_pages[$];
  constraint c_addr {
    unique { used_pages, page_num }; 
  }
  constraint c_align {
    solve align before page_offset;
    (align == 1) -> page_offset == 0;
  }
  function void post_randomize();
    used_pages.push_back(page_num);
  endfunction: post_randomize
endclass: address_selector

module top;
  address_selector as;
  
  int transfer_size = 1024 * 1000; // Size of 1000K
  initial begin
    as = new;
    while (transfer_size > 0) begin
      void'(as.randomize());
      transfer_size = transfer_size - (4096 - as.page_offset);
      $display("TRANSFER SIZE:%0x ALIGN:%0d ADDR:%0x PAGE:%0x OFFSET:%0x",transfer_size,as.align, {as.page_num,as.page_offset},as.page_num,as.page_offset);
    end
  end
endmodule

Simulation results:

V C S   S i m u l a t i o n   R e p o r t
Time: 0
CPU Time:    143.710 seconds;       Data structure size:   0.0Mb
Sat Aug 22 04:39:05 2020
CPU time: .134 seconds to compile + .023 seconds to elab + .392 seconds to link + 144.630 seconds in simulation

 

CASE 2:

In this case, we use an associative array instead of a queue.

We removed the constraint for generating unique page numbers; instead, post_randomize() checks whether the page number (page_num) has already been used and re-randomizes it until an unused one is found.

class address_selector;
  rand bit [51:0] page_num;
  rand bit [11:0] page_offset;
  rand bit align;
  bit [51:0] used_pages[*];
  constraint c_align {
    solve align before page_offset;
    (align == 1) -> page_offset == 0;
  }
  function void post_randomize();
    while(used_pages.exists(page_num)) begin
      std::randomize(page_num);
    end
    used_pages[page_num] = 1;
  endfunction: post_randomize
endclass: address_selector

module top;
  address_selector as;
  
  int transfer_size = 1024 * 1000;// 128 * 1024 * 100000;
  initial begin
    as = new;
    while (transfer_size > 0) begin
      void'(as.randomize());
      transfer_size = transfer_size - (4096 - as.page_offset);
      $display("TRANSFER SIZE:%0x ALIGN:%0d ADDR:%0x PAGE:%0x OFFSET:%0x",transfer_size,as.align, {as.page_num,as.page_offset},as.page_num,as.page_offset);
    end
  end
endmodule


Simulation results :

  V C S   S i m u l a t i o n   R e p o r t
Time: 0
CPU Time:      0.430 seconds;       Data structure size:   0.0Mb
Sat Aug 22 04:39:35 2020
CPU time: .141 seconds to compile + .022 seconds to elab + .460 seconds to link + .568 seconds in simulation


Conclusion :

See the difference: 143.710 seconds vs 0.430 seconds of CPU time.

For smaller address ranges, it doesn't make much of a difference, but for larger footprints it does show a significant improvement in the simulation times.

 


Tuesday, August 18, 2020

System Verilog Constraints for non-overlapping memory allocation

 class ex;
 
  parameter MAX = 1024;
 
  rand int unsigned max_val[4];
  rand int unsigned min_val[4];
  rand int unsigned rng_val[4];
 
  constraint c_min_max {
    rng_val.sum() <= MAX; // all ranges together must fit in the memory
    foreach(rng_val[i]) {
      max_val[i] inside { [0:MAX-1] };
      min_val[i] inside { [0:MAX-1] };
      rng_val[i] inside { [1:MAX] };
      max_val[i] == min_val[i] + rng_val[i]-1;
      if(i > 0) min_val[i] > max_val[i-1];
    }
  }
 

 
  function void post_randomize();
    foreach(rng_val[i])
    $display("MAX:%0d | MIN:%0d | RNG:%0d",max_val[i],min_val[i],rng_val[i]);
  endfunction  
 
endclass

module top;
  ex e;
 
  initial begin
    e = new;
    void'(e.randomize());
  end
endmodule

 

Simulation:

Compiler version P-2019.06-1; Runtime version P-2019.06-1; Aug 18 06:16 2020
MAX:144 | MIN:20 | RNG:125
MAX:271 | MIN:151 | RNG:121
MAX:457 | MIN:274 | RNG:184
MAX:574 | MIN:459 | RNG:116
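A quick way to sanity-check a result like this is a small helper that walks the ranges in order. The sketch below is only an illustration: the function ranges_are_disjoint and the module range_check_demo are hypothetical names, and the values are copied from the log above.

```systemverilog
module range_check_demo;
  // Returns 1 when every range is well formed and starts after the previous one ends.
  function automatic bit ranges_are_disjoint(input int unsigned lo[4],
                                             input int unsigned hi[4]);
    for (int i = 0; i < 4; i++) begin
      if (hi[i] < lo[i]) return 0;                 // malformed range
      if (i > 0 && lo[i] <= hi[i-1]) return 0;     // overlaps the previous range
    end
    return 1;
  endfunction

  int unsigned lo[4] = '{20, 151, 274, 459};       // MIN column of the log
  int unsigned hi[4] = '{144, 271, 457, 574};      // MAX column of the log

  initial begin
    if (ranges_are_disjoint(lo, hi)) $display("Ranges are disjoint");
    else                             $error("Ranges overlap");
  end
endmodule
```

The check relies on the same ordering that the constraint enforces, namely that range i always sits above range i-1.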

Monday, August 17, 2020

Use System verilog constraints to Divide data of N into chunks of M

 Problem Statement:

You need to come up with a SystemVerilog class (sequence item) which divides a large amount of data N into smaller chunks of equal size M. The chunk size M has to be a multiple of 4, with a minimum of 4 and a maximum of 64.

If the equal-sized chunks do not add up to N, the leftover data is added as a last chunk, which need not match M.


Solution:

class chunks;
  int unsigned c[$];
  parameter SIZE = 200;
  rand bit [6:0] s;
 
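  constraint c_size {
    s%4 == 0;
    s inside {[4:64]}; // minimum 4 (also avoids a divide by zero below), maximum 64
  }
 
  function void post_randomize();
    int tmp;
    c.delete(); // start from an empty queue on every randomize() call
    tmp = SIZE - (SIZE/s * s); // If the chunks do not add up to SIZE, collect the difference (same as SIZE % s)
   
    for(int i=0; i < (SIZE/s);i++) c.push_back(s);
    if(tmp !=0) c.push_back(tmp); // If a non-zero tmp is observed, simply add it to the queue.
  endfunction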
 
  function void display();
    $display("Size selected:%0d",s);
    $display("Queue value:%p",c);
  endfunction: display
endclass
 
 
  module top;
    chunks cs;
   
    initial begin
      cs = new;
      void'(cs.randomize());
      cs.display();
    end
  endmodule

Sunday, August 9, 2020

System verilog constraints interview question involving multiple variables

 

The sequence item is as follows:

rand bit          unique_bit;
rand int unsigned num_of_reqs;
rand bit [10:0]   x[];
rand bit [10:0]   y[];
rand bit [10:0]   width[];
rand bit [10:0]   height[];
rand bit [10:0]   frame_width;
rand bit [10:0]   frame_height;

Conditions for the constraints:

  1. Each request is a combination of x, y, width and height
  2. x + width must be less than or equal to the frame width
  3. y + height must be less than or equal to the frame height
  4. If the unique bit is set, the combination of x, y, w, h of a request must not be equal to that of any other request


 

Code::

 

class test;
  rand bit unique_bit;
  rand int unsigned num_of_reqs;
  rand bit [10:0] x[];
  rand bit [10:0] y[];
  rand bit [10:0] w[];
  rand bit [10:0] h[];
  rand bit [10:0] frame_width;
  rand bit [10:0] frame_height;

  // All four arrays hold one entry per request.
  constraint c_num_reqs {
    num_of_reqs inside {[1:5]};
    x.size() == num_of_reqs;
    y.size() == num_of_reqs;
    w.size() == num_of_reqs;
    h.size() == num_of_reqs;
  }

  constraint c_frame_width {
    frame_width inside {[0:1023]}; // Constraint will fail, if you don't cap your width
    foreach (x[i]) {
      int'(x[i] + w[i]) <= frame_width;
      x[i] inside {[0:frame_width]};
      w[i] inside {[0:frame_width]};
    }
  }

  constraint c_frame_height {
    frame_height inside {[0:1023]}; // Constraint will fail, if you don't cap your height
    solve frame_height before y, h; // pick the frame height first, then y and h
    foreach (y[i]) {
      int'(y[i] + h[i]) <= frame_height;
      y[i] inside {[0:frame_height]};
      h[i] inside {[0:frame_height]};
    }
  }

  // Note: this makes each array unique on its own, which is stricter than
  // only requiring the (x, y, w, h) combinations to differ.
  constraint c_unique {
    solve unique_bit before x,y,w,h,frame_height,frame_width;
    if(unique_bit) {
      unique {x};
      unique {y};
      unique {w};
      unique {h};
    }
  }

  function void display();
    $display("Unique Bit:%0d",unique_bit);
    $display("Num of Requests:%0d", num_of_reqs);
    $display("Frame Height:%0d Width:%0d",frame_height,frame_width);
    foreach(x[i])
      $display("X:%04d W:%04d || Y:%04d H:%04d",x[i],w[i],y[i],h[i]);
  endfunction
endclass


module top;
  test t;

  initial begin
    t = new;
    if(!t.randomize()) $error("Randomization failed");
    t.display();
  end
endmodule
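Condition 4 only asks for the combination of x, y, w and h to differ between requests, so individual values may repeat. One way this could be written, as a sketch rather than the only possible form, is a pairwise comparison over all request pairs. The names req_item, c_unique_combo and combo_demo are hypothetical, and the frame constraints are left out to keep the example short.

```systemverilog
class req_item;
  rand bit        unique_bit;
  rand bit [10:0] x[], y[], w[], h[];

  constraint c_size {
    x.size() inside {[1:5]};
    y.size() == x.size();
    w.size() == x.size();
    h.size() == x.size();
  }

  // Two requests may share x, y, w or h individually, but not all four at once.
  constraint c_unique_combo {
    if (unique_bit) {
      foreach (x[i]) {
        foreach (x[j]) {
          (i < j) -> !(x[i] == x[j] && y[i] == y[j] &&
                       w[i] == w[j] && h[i] == h[j]);
        }
      }
    }
  }
endclass

module combo_demo;
  req_item r;
  initial begin
    r = new;
    if (!r.randomize() with { unique_bit == 1; }) $error("Randomization failed");
    foreach (r.x[i])
      $display("X:%0d Y:%0d W:%0d H:%0d", r.x[i], r.y[i], r.w[i], r.h[i]);
  end
endmodule
```

The (i < j) guard compares each pair once and skips comparing a request with itself.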

Saturday, August 8, 2020

Find duplicate element in system verilog array.

 

  1. Array of size 100

The array holds the elements 100 to 199, randomly shuffled.

One element is replaced with another number in the same range. Find the replaced number and its position.

One condition is that you should not use a nested loop.

 

 

module top;
  int unsigned arr[100];
  int unsigned change_index;
  int sum;
  int tmp[$];
  
  initial begin
    std::randomize(arr) with { foreach (arr[i]) { arr[i] inside {[100:199]};}
                               unique {arr}; 
                             };
    std::randomize(change_index) with { change_index inside {[10:99]}; };
    $display("Index:%0d Val:%0d",change_index,arr[change_index]);
    sum = arr.sum();
    arr[change_index] = 110;
    
    for( int i=0; i < 100; i++) begin //{
      tmp = arr.find_index with (item == arr[i]);
      if(tmp.size() > 1) begin //{
        $display("Duplicate found: Index:%0d Val:%0d",i,arr[i]);
        if(sum > arr.sum()) $display("Index:%0d Original:%0d Duplicate:%0d",i,arr[i]+(sum-arr.sum()),arr[i]);
        else                $display("Index:%0d Original:%0d Duplicate:%0d",i,arr[i]-(arr.sum()-sum),arr[i]);
        break;
      end //}
      tmp.delete();
    end //}
                                    

  end
endmodule
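Since find_index scans the whole array on every iteration, a single pass with an associative array is another option. The sketch below is an alternative to the code above, not the original method: it uses a small hand-written array of 8 elements (100 to 107, with 105 replaced by 102 at index 5), and all names in it are hypothetical.

```systemverilog
module dup_single_pass;
  // 100..107 with the value 105 replaced by 102 at index 5
  int unsigned arr[8] = '{100, 101, 102, 103, 104, 102, 106, 107};
  int          first_seen[int unsigned];   // value -> index where it was first seen
  int unsigned expected_sum, actual_sum, dup_val;

  initial begin
    foreach (arr[i]) begin
      expected_sum += 100 + i;             // sum of the untouched sequence
      actual_sum   += arr[i];
      if (first_seen.exists(arr[i])) begin
        dup_val = arr[i];
        $display("Duplicate value %0d at indices %0d and %0d",
                 arr[i], first_seen[arr[i]], i);   // 102 at indices 2 and 5
      end
      else first_seen[arr[i]] = i;
    end
    // The missing value is the duplicate corrected by the difference of the sums.
    $display("Missing value:%0d", dup_val + expected_sum - actual_sum);  // 105
  end
endmodule
```

From the array alone, only the two positions holding the duplicate can be reported; which of them was the overwritten one is not recoverable without extra information.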

Sunday, August 2, 2020

To print values based on their decimal places

If you have an integer value, say 234, and you need to get each of its decimal digits:
You simply need to divide the integer, say 'a', by the weight of the decimal position and take the result modulo 10.
(a/1) % 10 gives you the ones digit
(a/10) % 10 gives you the tens digit, and so on.


CODE:

module top;
  int unsigned a = 234;
 
  initial begin
    $display("A_1  :%0d",(a/1)%10  );
    $display("A_10 :%0d",(a/10)%10 );
    $display("A_100:%0d",(a/100)%10);
  end
 
endmodule:top

RESULT:
A_1 :4
A_10 :3
A_100:2
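The same idea can be put in a loop so that it works for any number of digits. This is a minimal sketch; the module name digits_demo and the variable place are made up for the example.

```systemverilog
module digits_demo;
  int unsigned     a     = 234;
  longint unsigned place = 1;   // 64-bit, so place*10 cannot overflow for a 32-bit 'a'

  initial begin
    // do-while runs at least once, so a == 0 still prints one digit.
    do begin
      $display("Place %0d: digit %0d", place, (a / place) % 10);
      place *= 10;
    end while (a / place != 0);
  end
endmodule
```

For a = 234 this prints the digits 4, 3 and 2 for the places 1, 10 and 100.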

Saturday, August 1, 2020

Sorting array - descending order using 1 and 2 for loops.

Code
module top;
  int a[5] = {1,2,3,4,5};
  int tmp;
 
  initial begin //{
    // 2 loops
    for(int i=0; i< $size(a)-1;i++) begin //{
      for(int j=i+1;j <$size(a);j++) begin //{
        if(a[i] < a[j]) begin //{
          tmp  = a[i];
          a[i] = a[j];
          a[j] = tmp;
        end //}
      end //}
    end //}
    $display("Array:%p",a);
   
    // 1 loop
    a = '{1,2,3,4,5}; // restore the unsorted array before trying the second method
    for(int i=0; i<$size(a)-1;i++) begin//{
      if(a[i] < a[i+1]) begin //{
        tmp    = a[i];
        a[i]   = a[i+1];
        a[i+1] = tmp;
        i      = -1;
      end //}
    end //}
    $display("Array:%p",a);

  end //}
endmodule:top

Result
CPU time: .230 seconds to compile + .328 seconds to elab + .297 seconds to link
Chronologic VCS simulator copyright 1991-2019
Contains Synopsys proprietary information.
Compiler version P-2019.06-1; Runtime version P-2019.06-1; Aug 2 01:52 2020
Array:'{5, 4, 3, 2, 1}
Array:'{5, 4, 3, 2, 1}
V C S S i m u l a t i o n R e p o r t
Time: 0 ns
CPU Time: 0.640 seconds; Data structure size: 0.0Mb
Sun Aug 2 01:52:17 2020

Product of arrays except self

The code looks simple after looking at the answer 😆, I copied it from
https://leetcode.com/problems/product-of-array-except-self/solution/

In the first loop, for each location you multiply all the array values to its left.
Since the first entry has no value to its left, we enter 1 for array location 0.

In the second loop, all you need to do is repeat the same from the right.
However, you have 2 statements here.
The 1st statement writes the final product (except self) to the location: R, which holds the product of all values to the right of location i, is multiplied with z[i].
For the last entry there is no entry to its right, so R starts at 1 (R=1).
The 2nd statement updates R with a[i], similar to the left product logic mentioned above.

Code
module top;
  int a[5] = { 5,4,3,2,2 };
  int z[5];
  int R=1;
  initial begin
    z[0] = 1;
    for(int i=1; i< $size(a);i++) z[i] = z[i-1] * a[i-1];
   
    for(int i=$size(a)-1;i >=0;i--) begin //{
      z[i] = z[i] * R;
      R *= a[i];
    end //}
    $display("Array:%p",a);
    $display("Products:%p",z);
  end
endmodule

Answer
# vsim -voptargs=+acc=npr
# run -all
# Array:'{5, 4, 3, 2, 2}
# Products:'{48, 60, 80, 120, 120}

Wednesday, July 29, 2020

System verilog assertion - If b is asserted in the current cycle, a must have been present anywhere between 1 - 3 cycles earlier than b.

Problem Statement:
1. A and B are 2 pulses.
If B is asserted, A must have been asserted anywhere between 1 and 3 cycles in the past.

Solution :
A single $past call looks back a fixed number of cycles, so on its own it cannot express a check over a range of cycles.

One way around this is to use an intermediate signal (a vector).

Let us use a vector that is long enough to cover the duration under check.
In our case, it is between 1 and 3 cycles.
Therefore, let us take an intermediate signal arr, ranging from 3:0.
Each bit position corresponds to 1 cycle.

The code snippet will be something like this:

always @(posedge clk) arr <= {arr[2:0], a};

Code for assertion :

  property check;
    @(posedge clk) (b==1) |-> $countones(arr[3:1])>=1; // omit arr[0], as it represents the cycle in which 'b' is set to 1
  endproperty: check

  a_check: assert property (check);
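As an alternative sketch that needs no helper vector, the three cycles can also be covered by OR-ing three $past calls, each with its own fixed depth. This is not from the original solution; the module and property names (past_range_check, p_a_before_b) are hypothetical.

```systemverilog
module past_range_check (input logic clk, input logic a, input logic b);
  // 'a' must have been high 1, 2 or 3 clock cycles before any cycle in which 'b' is high.
  property p_a_before_b;
    @(posedge clk) b |-> ($past(a, 1) || $past(a, 2) || $past(a, 3));
  endproperty

  a_past_check: assert property (p_a_before_b);
endmodule
```

This stays readable for a short window like 1 to 3 cycles; for a long window the shift-register approach above scales better.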

One more way to do it is to use sequence.triggered.
As per the LRM, triggered is a method which checks whether the operand sequence has reached its end point at that point in time.

Using this, we can write a sequence and check whether it has reached its end point at $rose(b).

sequence s_1;
  @(posedge clk) $rose(a) ##[1:3]  1;
endsequence: s_1

property test_1;
  @(posedge clk) $rose(b) |-> s_1.triggered;
endproperty: test_1

Notes:
Now the sequence s_1 matches 1 to 3 cycles after $rose(a).
Property test_1 checks whether s_1 has reached its end point (s_1.triggered) at $rose(b).

Code with stimulus :
module top;
  bit a;
  bit b;
  bit [3:0] arr;

  bit clk;

  initial  begin
    $timeformat(-9,3,"ns",8);
    clk <= 0;
    forever #5 clk = !clk;
  end

  initial
    $monitor("%0t - A -%0d B -%0d Arr:%0b",$time,a,b,arr);

  initial begin
    repeat(1) @(posedge clk);
    a = 1;
    repeat(1) @(posedge clk);
    a = 0;

    repeat(2) @(posedge clk); // Check should pass, as 'a' is asserted 3 cycles before 'b'.
    b = 1;
    repeat(1) @(posedge clk);
    b = 0;
  
    repeat(3) @(posedge clk); // Check should fail, as 'a' is not present in the previous 3 cycles
    b = 1;
    repeat(1) @(posedge clk);
    b = 0;
   
    repeat(1) @(posedge clk); // Check should fail, as 'a' and 'b' are asserted in the same cycle.
    a = 1; b = 1;
    repeat(1) @(posedge clk);
    a = 0; b = 0;
    repeat(4) @(posedge clk);
    $finish;
  end

  always @(posedge clk) begin
    arr <= { arr[2:0], a };
  end

  property check;
    @(posedge clk) (b==1) |-> $countones(arr[3:1])>=1; // omit arr[0], as it represents the cycle in which 'b' is set to 1
  endproperty: check

  a_check: assert property (check);
endmodule

Result :

Compiler version L-2016.06; Runtime version L-2016.06;  Jul 29 04:18 2020
0ns - A -0 B -0 Arr:0
5ns - A -1 B -0 Arr:1
15ns - A -0 B -0 Arr:10
25ns - A -0 B -0 Arr:100
35ns - A -0 B -1 Arr:1000
45ns - A -0 B -0 Arr:0
75ns - A -0 B -1 Arr:0
"assertion_past_value.sv", 51: top.a_check: started at 85000ps failed at 85000ps
        Offending '($countones(arr[3:1]) >= 1)'
85ns - A -0 B -0 Arr:0
95ns - A -1 B -1 Arr:1
"assertion_past_value.sv", 51: top.a_check: started at 105000ps failed at 105000ps
        Offending '($countones(arr[3:1]) >= 1)'
105ns - A -0 B -0 Arr:10
115ns - A -0 B -0 Arr:100
125ns - A -0 B -0 Arr:1000
135ns - A -0 B -0 Arr:0
$finish called from file "assertion_past_value.sv", line 40.
$finish at simulation time    145ns

Constraint to have N elements distributed in M bins

Code to distribute N elements into M bins, you add unique keyword to have each bin will have unique number of elements. class test; param...